Please contribute to the notes! https://etherpad.fr/p/LPC2014_DevTools

Development Tools Microconference Notes

Welcome to Linux Plumbers Conference 2014. Please use this etherpad to take notes. Microconf leaders will be giving a summary of their microconference during the Friday afternoon closing session. Please remember there is no video this year, so your notes are the only record of your microconference.

9:30 - 9:50 Coccinelle: A program matching and transformation tool: Himangi Saraogi
http://coccinelle.lip6.fr/
- Computation Tree Logic
- Collateral evolution

Q: Does it generate commit messages?
Julia: there is a tool inside Coccinelle that will look for changes that share the same maintainer
- splitpatch
- it runs 'git log --oneline'

Q: Any plans on improving the error messages?
Julia: I'm not very motivated to change the parser generator, and that is what generated the error message that Himangi cited

Q: Is Coccinelle usable for static analysis?
Julia: Coccinelle just matches patterns against code
- It does not do any alias analysis
- the niche of Coccinelle is that it is very easy to write your own rules
- "Consider paying Coverity"

Q: How to check only new changes?
Julia: Coccinelle has an option that can check uncommitted changes
- In the future we may add the ability to check patches
- In the future there might be a way to automatically detect which rules apply to a particular patch

Q: Can this be used to transform C code into something else?
Julia: It won't work

Q: Is there a tool that converts a C patch into a semantic patch because SmPL is too hard?
Julia: There is a tool called "spdiff" that can do that for simple cases
- spdiff takes a bunch of patches and generates a semantic patch

9:50 - 10:10 Backporting the kernel with SmPL: Luis Rodriguez
backports.git on k.org
irc.freenode.net #kernel-backports
- "well over 800 drivers"
- compromises:
- DRM drivers dropped
- regulator framework dropped
- "only carry things folks need"
- only support kernels >= 3.0
- this is for upstream drivers, not proprietary drivers

backports project uses Coccinelle daily
- unusual
- "stressing Coccinelle"

Formerly, folks would add kernel version ifdefs while backporting
- results in long patches
- hard to review them

Instead, add helper functions that encapsulate the #ifdefs
- one patch for one single upstream commit
- track linux-next daily
- can it be done automatically? -> Coccinelle

With Coccinelle, one does not have to deal with hunk context at all
- How about hard stuff?
- threaded IRQs?
- added in 2.6.31
- added struct compat_threaded_irq
- extended the data structure
- added compat_request_threaded_irq()

Pay attention to the private argument
- "some kind of pointer"
- private for each device driver
- 'private' data structure type was inferred by Coccinelle
- upstream static inlines would also have helped
- use static inline functions for structure accessors

How can one be 100% certain that this transformation is correct?
- verify against the original hand-created backport

Coccinelle revealed inconsistencies in the original backport

Future work:
- linux-next conflicts required manual fixing
- once every two weeks
- manual intervention required
- can this be automated?

Solicitation for funding for this project at InRIA

Q: Has this been used for backporting only drivers, or also subsystems?
A: Both

"Atomic patches" = a patch for a single collateral evolution

Q: What is the strategy of backporting?
A: Linux kernel developers should not have to do backporting
- it should be done automatically

Q: Why are backports necessary?
A: "a lot of people can't use the latest kernel"

Q: Backports create a package that you compile against any kernel?
A: yes

10:10 - 10:35 How to fix bugs automatically? Martin Monperrus
Automatic software repair
Software has a lot of bugs
We spend a lot of time fixing them
How to detect them?
How to generate a patch to fix them automatically?
There are simple bugs
Is it possible to devise algorithms that take a description of the bug and generate a proposed patch?
Perhaps generate several different patches and allow the developers to choose between them?
(speaker drawing comparisons to "le merveilleux" recipes)

Choose a class of bugs
- Buffer overflow? Unhandled exception? Infinite loops? Crashes? Memory leaks? API misuse? Buggy conditions?
* focusing on buggy conditions, missing preconditions
* students focusing on infinite loops and exceptions

Identify a good bug oracle
- A bug oracle tells you "yes there is a bug" or "no it is fixed"
- examples: crashing input, failing test cases (Java)
- enables trial-and-error automatic generation of attempted fixes

A good bug oracle is automated and does not take too long to run

Set up repair operators
- "a generic modification on source code"
- e.g., adding preconditions

"Some repair operators are surgical, others are very violent"
- "In some cases one needs a regression oracle"
- a strong test suite is needed

Q: why is the bug oracle separate from the regression oracle? Why not add your bug oracle to the set of regression oracles?
A: to maintain performance during the automatic fix generation phase

CPU usage
- 10 minutes in one example, one hour in another

Q: testing .NET apps is different from testing the kernel
- one may introduce a regression that is not covered by the existing test suite
A: agrees

Some bug oracles can be embedded as runtime assertions
But in most cases the intention is to make recommendations to developers

Does it work?
- "only the beginning"
- "~100 real bugs have been fixed automatically"

Q: How many of the ~100 fixes have been accepted by the maintainers?
A: Well, all of our development has been on existing bugs with existing fixes, to try to verify our approach

Q: what licenses does your team use?
A: mostly GPL but it depends

List of publicly available tools: http://www.monperrus.net/martin/automatic-software-repair-tools

10:40 - 11:05 GCC and LLVM collaboration review: Renato Golin
Had a meeting between the GCC and LLVM teams
"Had a bit of a clash"
Disagreement on licenses, agreement on technical issues

Problems:
- gcc behaves in one way, clang behaves in another

Would like to standardize public interfaces
Much is shared: languages, assembly language, many command line options, libraries, binutils, LTO, PGO
Common projects: gas, ld, glibc/newlib, sanitizers

What about:
- a common user interface?
- better documentation of extensions?
- inline assembly?
- avoiding unintended side effects that surprise users?
Driver: common user interface
- Command line flags, mostly
- Target descriptions can have subtle differences (e.g., arm-linux-gnueabihf-gcc)
- Return values

Behan: clang is a single compiler that targets multiple architectures, but gcc binaries are built separately for each target

Extension standards
- Inline assembler, attribute, language extensions, compiler extensions
- LLVM will never implement VLAIS
- LLVM will never implement nested functions

Getting involved: http://clang.llvm.org/get_involved.html

"gcc is not the C standard"
- consider compiling your code with both gcc and clang
- one compiler may have different warnings than the other

11:05 - 11:30 Testing Multiple Architectures Using QEMU: Christopher Covington
testing with QEMU (qemu-system-$arch) cannot replace real machines
but testing with QEMU should replace "no testing at all"
existing users: Aboriginal Linux, strace, virtme (https://github.com/amluto/virtme)
case study: CRIU
building: use native toolchain or cross-compiling, out-of-tree build support is fine but not necessary
run-time deps: root fs can be initramfs, image (genext2fs recommended), network/passthrough (works for >5G)
- virtio/9pfs
- needed to work around problems related to image size (> 5 GB?)
Example QEMU command lines: http://www.spinics.net/lists/linux-virtualization/msg23031.html
- speed: optimize CFLAGS, run only what is needed, static better than dynamic
- but presenter has not measured yet

broonie: others are looking at the same approach (Grant Likely/Shuah Khan?)
luisr: also does the same thing, but not across separate architectures
- attempt is to use this "to isolate oopses"

Does linux-next use emulation/virtualization for testing?
How to deal with the rootfs? prepackaged or build from scratch?
- Support both?
Take a checkpoint of a booted-up system that's about to load a testcase
http://www.quicinc.com/

11:30 - 11:50 Undertaker: Valentin Rothberg
undertaker.cs.fau.de
How to avoid #ifdef bugs in the Linux kernel

kernel/smp.c: #ifdef CONFIG_CPU_HOTPLUG example
- leaks memory!
* CONFIG_CPU_HOTPLUG does not exist!
* CONFIG_HOTPLUG_CPU is the right option

Undefined CPP identifiers evaluate to false
- can lead to dead/undead #ifdef blocks

Kconfig identifiers have the same problem
e.g. dependencies on undefined identifiers

How big is the problem?
- on average, 600-700 dead/undead blocks!
- 9k-10k dead/undead SLOC
- at least the problem is getting smaller per kernel version...

How to avoid this?
- new tool: undertaker-checkpatch
- checks git commits for #ifdef bugs
- can be used like checkpatch.pl
- can detect if bugs are added or repaired

Q: can this be run on a source tree like checkpatch.pl, or does it require a git tree?
A: requires a git tree

mach-omap2/board-h4.c example

Kconfig changes are critical
- can find cases where Kconfig lines are removed, but references persist in the tree

Can also find logical constraints
- nested defines where the inner #ifdef mandates a certain evaluation of the outer #ifdef
- so an #else arm of the inner conditional can be dead/undead

undertaker feeds a SAT-solver
runtime can take 2-4 minutes

Future work:
- vampyr

Q: Why not integrate this into the linux kernel tree?
A: too difficult, not enough time

Q: Why not integrate this into checkpatch.pl?
A: not written in Perl

Q: does it detect #ifdef hidden behind wrappers such as IS_ENABLED(CONFIG_***)?
A: on preprocessor level yes, on C level we are working on that

Q: Is it possible to use this in other projects that use Kconfig?
A: yes, it's working on l4-fiasco and coreboot, busybox

Q: why is there a dependency on git - can't it just be run on a plain source tree?
Stefan: undertaker does not have a git dependency, but undertaker-checkpatch does
A: send patches or requests to us

11:50 - 12:10 Vampyr: Stefan Hengelein
Configurability-aware compile-testing of source files
Code is often compile-tested with only one .config

Configuration coverage with Vampyr
- create a minimal set of .configs that cover all of the #ifdef blocks in one file (on average 20% more compiler calls)
- apply Coccinelle, sparse, gcc, etc.

Evaluated for Linux v3.2
- number of compile warnings & errors increased significantly with the Vampyr sets
- found quite a few bugs
* "Luckily the number of found warnings and errors is lower in Linux v3.17"

MIPS v3.17 example
- db120_pci_init()
- but hard to know if this was intentional?
- why not #error?

ARM v3.17 example
- arch/arm/mm/dma-mapping.c: 'atomic_pool' variable declared in an #ifdef, but used outside
- also same problem with a declaration of __in_atomic_pool()

Another MIPS v3.17 issue: variable only used in an #ifdef block but undeclared
- so it's clear the code has never been compiled

quotes Roeck: maintainers should care about make allmodconfig and make allyesconfig
licensed under GPLv3 (undertaker.cs.fau.de)
integration into undertaker-checkpatch is on the way

Q: how many patches have been fixed so far?
A: seven fix patches committed so far

cross-compilers can also be used and an arbitrary number of flags can be passed to the compiler through vampyr

12:10 - 12:30 Discussion (pwsan)
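
Note added for illustration (not from the talk): the Vampyr idea of covering every #ifdef block of a file with a small set of .configs can be sketched roughly as below. This is not Vampyr's implementation. The notes only say that undertaker feeds a SAT solver, whereas this sketch swaps in brute-force enumeration plus a greedy set cover, which is not guaranteed to be minimal and only scales to a handful of symbols. The function names, the symbol list and the three blocks are all hypothetical.

```python
from itertools import product


def enabled_blocks(config, blocks):
    """Return the names of the #ifdef blocks that are compiled under config."""
    return {name for name, condition in blocks.items() if condition(config)}


def cover_blocks(symbols, blocks):
    """Greedily choose configs until every reachable block is compiled once."""
    # Enumerate every on/off assignment; only feasible for a few symbols.
    configs = [dict(zip(symbols, bits))
               for bits in product([False, True], repeat=len(symbols))]
    # Blocks that no assignment enables are dead and cannot be covered.
    uncovered = set()
    for config in configs:
        uncovered |= enabled_blocks(config, blocks)
    chosen = []
    while uncovered:
        # Take the config that compiles the most still-uncovered blocks.
        best = max(configs,
                   key=lambda c: len(enabled_blocks(c, blocks) & uncovered))
        chosen.append(best)
        uncovered -= enabled_blocks(best, blocks)
    return chosen


# Hypothetical file with three conditional blocks (names are illustrative).
SYMBOLS = ["SMP", "HOTPLUG_CPU"]
BLOCKS = {
    "smp_path": lambda c: c["SMP"],
    "up_path": lambda c: not c["SMP"],  # the #else arm of the SMP block
    "hotplug_path": lambda c: c["SMP"] and c["HOTPLUG_CPU"],
}

if __name__ == "__main__":
    for config in cover_blocks(SYMBOLS, BLOCKS):
        print(config, sorted(enabled_blocks(config, BLOCKS)))
```

Compiling with a single .config would miss at least one arm of the SMP conditional here; the sketch picks a second configuration so that both arms get compiled, which is the effect the Vampyr sets are described as having.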