Many hyperlinks are disabled.
Use anonymous login
to enable hyperlinks.
Overview
Comment: | Enhancements and updates to the discussion of fuzzing in the testing document. |
---|---|
Downloads: | Tarball | ZIP archive |
Timelines: | family | ancestors | descendants | both | trunk |
Files: | files | file ages | folders |
SHA3-256: |
3bf90e0caaaa351c6fea10d13659b20f |
User & Date: | drh 2019-12-21 17:23:35.167 |
Context
2019-12-24
| ||
02:33 | Document the fact that the R-Tree extension is not capable of doing simultaneous reads and writes. (check-in: a8afb2f905 user: drh tags: trunk) | |
2019-12-21
| ||
17:23 | Enhancements and updates to the discussion of fuzzing in the testing document. (check-in: 3bf90e0caa user: drh tags: trunk) | |
15:22 | Enhance CLI documentation to talk more about reading ZIP archives as database files. (check-in: e92fb43396 user: drh tags: trunk) | |
Changes
Changes to pages/testing.in.
︙ | ︙ | |||
368 369 370 371 372 373 374 | wildly nonsensical SQL statements and feeding them to SQLite to see what it will do with them. Usually some kind of error is returned (such as "no such table"). Sometimes, purely by chance, the SQL statement also happens to be semantically correct. In that case, the resulting prepared statement is run to make sure it gives a reasonable result.</p> | < < < < < > > > | < | | | | < < < < < < < | < | 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 | wildly nonsensical SQL statements and feeding them to SQLite to see what it will do with them. Usually some kind of error is returned (such as "no such table"). Sometimes, purely by chance, the SQL statement also happens to be semantically correct. In that case, the resulting prepared statement is run to make sure it gives a reasonable result.</p> <tcl>hd_fragment aflfuzz {American Fuzzy Lop fuzzer} {AFL}</tcl> <h3>SQL Fuzz Using The American Fuzzy Lop Fuzzer</h3> <p>The concept of fuzz testing has been around for decades, but fuzz testing was not an effective way to find bugs until 2014 when Michal Zalewski invented the first practical profile-guided fuzzer, <a href="http://lcamtuf.coredump.cx/afl/">American Fuzzy Lop</a> or "AFL". Unlike prior fuzzers that blindly generate random inputs, AFL instruments the program being tested (by modifying the assembly-language output from the C compiler) and uses that instrumentation to detect when an input causes the program to do something different - to follow a new control path or loop a different number of times. Inputs that provoke new behavior are retained and further mutated. In this way, AFL is able to "discover" new behaviors of the program under test, including behaviors that were never envisioned by the designers. <p>AFL proved adept at finding arcane bugs in SQLite. Most of the findings have been assert() statements where the conditional was false under obscure circumstances. But AFL has also found a fair number of crash bugs in SQLite, and even a few cases where SQLite computed incorrect results. <p>Because of its past success, AFL became a standard part of the testing strategy for SQLite beginning with [version 3.8.10] ([dateof:3.8.10]) until it was superceded by better fuzzers in [version 3.29.0] ([dateof:3.29.0]). <tcl>hd_fragment ossfuzz {OSS Fuzz}</tcl> <h3>Google OSS Fuzz</h3> <p>Beginning in 2016, a team of engineers at Google started the [https://github.com/google/oss-fuzz|OSS Fuzz] project. OSS Fuzz uses a AFL-style guided fuzzer running on Google's infrastructure. |
︙ | ︙ | |||
436 437 438 439 440 441 442 | [https://github.com/google/fuzzer-test-suite/blob/master/tutorial/structure-aware-fuzzing.md|Structure-Aware Mutator] on a specialized input file that defines both an input database and SQL text to be run against that database. Because it mutates both the input database and the input SQL at the same time, dbsqlfuzz has been able to find some obscure faults in SQLite that were missed by prior fuzzers that mutated only SQL inputs or only the database file. | | < | > > > > > > > > | > > | > > > > > > > > | > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > | 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 | [https://github.com/google/fuzzer-test-suite/blob/master/tutorial/structure-aware-fuzzing.md|Structure-Aware Mutator] on a specialized input file that defines both an input database and SQL text to be run against that database. Because it mutates both the input database and the input SQL at the same time, dbsqlfuzz has been able to find some obscure faults in SQLite that were missed by prior fuzzers that mutated only SQL inputs or only the database file. <p>The SQLite developers usually leave an instance or two of dbsqlfuzz running on the latest trunk code of SQLite whenever they are away from the office for an extended period, such as overnight. The dbsqlfuzz fuzzer has proven so effective that OSSFuzz has ceased to find new problems in SQLite, except for cases when a new bug is introduced and OSSFuzz finds it overnight before dbsqlfuzz has had a chance to run. (Examples: [https://www.sqlite.org/src/timeline?y=ci&c=c422afb507dc8757|[1]] [https://www.sqlite.org/src/timeline?y=ci&c=0a2eb949f8a759e5|[2]] [https://www.sqlite.org/src/timeline?y=ci&c=62f2235adf796c72|[3]]) <tcl>hd_fragment 3pfuzz {3rd-party fuzzers}</tcl> <h3>Other third-party fuzzers</h3> <p>SQLite seems to be a popular target for third-parties to fuzz. The developers hear about many attempts to fuzz SQLite and they do occasionally get bug reports found by independent fuzzers. All such reports are promptly fixed, so the product is improved and that the entire SQLite user community benefits. This mechanism of having many independent testers is similar to [https://en.wikipedia.org/wiki/Linus%27s_law|Linus's law]: "given enough eyeballs, all bugs are shallow". <p>One fuzzing researcher of particular note is <a href="https://www.manuelrigger.at/">Manuel Rigger</a>, currently (as this paragraph is written on 2019-12-21) at <a href="https://ethz.ch/en.html">ETH Zurich</a>. Most fuzzers only look for assertion faults, crashes, undefined behavior (UB), or other easily detected anomalies. Dr. Rigger's fuzzers, on the other hand, are able to find cases where SQLite computes an incorrect answer. Rigger has found [https://www.sqlite.org/src/timeline?y=t&u=mrigger&n=all|many such cases]. Most of these finds are fairly obscure corner cases involving type conversions and affinity transformations, and a good number of the finds are against unreleased features. Nevertheless, his finds are still important as they are real bugs, and the SQLite developers are grateful to be able to identify and fix the underlying problems. Rigger's work is currently unpublished. When it is released, it could be as influential as Zalewski's invention of AFL and profile-guided fuzzing. <tcl>hd_fragment fuzzcheck fuzzcheck</tcl> <h3>The fuzzcheck test harness</h3> <p>Historical test cases from [AFL], [OSS Fuzz], and [dbsqlfuzz] are collected in a set of database files in the main SQLite source tree and then rerun by the "fuzzcheck" utility program whenever one runs "make test". Fuzzcheck only runs a few thousand "interesting" cases out of the hundreds of millions of cases that the various fuzzers have examined over the years. "Interesting" cases are cases that exhibit previously unseen behavior. Actual bugs found by fuzzers are always included among the interesting test cases, but most of the cases run by fuzzcheck were never actual bugs. <tcl>hd_fragment tension {coverage testing vs. fuzz testing}</tcl> <h3>Tension Between Fuzz Testing And 100% MC/DC Testing</h3> <p>Fuzz testing and [MC/DC|100% MC/DC testing] are in tension with one another. That is to say, code tested to 100% MC/DC will tend to be more vulnerable to problems found by fuzzing and code that performs well during fuzz testing will tend to have (much) less than 100% MC/DC. This is because MC/DC testing discourages [defensive code] with unreachable branches, but without defensive code, a fuzzer is more likely to find a path that causes problems. MC/DC testing seems to work well for building code that is robust during typical normal use, whereas fuzz testing is good for building code that is robust against malecious attack. <p>Of course, users would prefer code that is both robust in normal use and resistent to malcious attack. The SQLite developers are dedicated to providing that. The purpose of this section is merely to point out that doing both at the same time is hard. <p>For much of its history SQLite has been focused on 100% MC/DC testing. Resistence to fuzzing attacks only became a concern with the introduction of AFL in 2014. For a while there, fuzzers were finding many problems in SQLite. In more recent years, the testing strategy of SQLite has evolved to place more emphasis on fuzz testing. We still maintain 100% MC/DC of the core SQLite code, but most testing CPU cycles are now devoted to fuzzing. <p>While fuzz testing and 100% MC/DC testing are in tension, they are not completely at cross-purposes. The fact that the SQlite test suite does test to 100% MC/DC means that when fuzzers do find problems, those problems can be fixed quickly and with little risk of introducing new errors. <h2>Malformed Database Files</h2> <p>There are numerous test cases that verify that SQLite is able to deal with malformed database files. These tests first build a well-formed database file, then add corruption by changing one or more bytes in the file by some means |
︙ | ︙ | |||
574 575 576 577 578 579 580 | <p>Any one of the above test cases would provide 100% statement coverage but all three are required for 100% branch coverage. Generally speaking, 100% branch coverage implies 100% statement coverage, but the converse is not true. To reemphasize, the [TH3] test harness for SQLite provides the stronger form of test coverage - 100% branch test coverage.</p> | | | 632 633 634 635 636 637 638 639 640 641 642 643 644 645 646 | <p>Any one of the above test cases would provide 100% statement coverage but all three are required for 100% branch coverage. Generally speaking, 100% branch coverage implies 100% statement coverage, but the converse is not true. To reemphasize, the [TH3] test harness for SQLite provides the stronger form of test coverage - 100% branch test coverage.</p> <tcl>hd_fragment defcode {defensive code}</tcl> <h2>Coverage testing of defensive code</h2> <p>A well-written C program will typically contain some defensive conditionals which in practice are always true or always false. This leads to a programming dilemma: Does one remove defensive code in order to obtain 100% branch coverage?</p> |
︙ | ︙ |