Please contribute to the notes!

https://etherpad.fr/p/LPC2014_DevTools

Development Tools Microconference Notes

Welcome to Linux Plumbers Conference 2014.

Please use this etherpad to take notes. Microconf leaders will be giving a summary of their microconference during the Friday afternoon closing session.
Please remember there is no video this year, so your notes are the only record of your microconference.

9:30 - 9:50
Coccinelle: A program matching and transformation tool: Himangi Saraogi
http://coccinelle.lip6.fr/

- Computation Tree Logic (CTL)
- Collateral evolution

Q: Does it generate commit messages?
Julia: there is a tool inside Coccinelle that will look for changes that share the same maintainer - splitpatch
	- it runs 'git log --oneline'
	
Q: Any plans on improving the error messages?
Julia: I'm not very motivated to change the parser generator, and that is what generated the error message that Himangi cited

Q: Is Coccinelle usable for static analysis?
Julia: Coccinelle just matches patterns against code
	- It does not do any alias analysis
	- the niche of Coccinelle is that it is very easy to write your own rules
	- "Consider paying Coverity"

Q: How to check only new changes?
Julia: Coccinelle has an option that can check uncommitted changes
	- In the future we may add the ability to check patches
	- In the future there might be a way to automatically detect which rules apply to a particular patch

Q: Can this be used to transform C code into something else?
Julia: It won't work

Q: Is there a tool that converts a C patch into a semantic patch because SmPL is too hard?
Julia: There is a tool called "spdiff" that can do that for simple cases
	- spdiff takes a bunch of patches and generates a semantic patch


9:50 - 10:10
Backporting the kernel with SmPL: Luis Rodriguez

backports.git on k.org
irc.freenode.net #kernel-backports

- "well over 800 drivers"
- compromises:
    - DRM drivers dropped
    - regulator framework dropped
    - "only carry things folks need"
    - only support kernels >= 3.0
    - this is for upstream drivers, not proprietary drivers

backports project uses Coccinelle daily
- unusual
- "stressing Coccinelle"

Formerly, folks would add kernel version ifdefs while backporting
- results in long patches
	- hard to review them

Instead, add helper functions that encapsulate the #ifdefs
- one patch for one single upstream commit
- track linux-next daily
? can it be done automatically?  -> Coccinelle

With Coccinelle, one does not have to deal with hunk context at all
- How about hard stuff?
  - threaded IRQs?
  - added in 2.6.31
  - added struct compat_threaded_irq
  - extended the data structure
  - added compat_request_threaded_irq()

Pay attention to the private argument
- "some kind of pointer"
	- private for each device driver
- 'private' data structure type was inferred by Coccinelle
- upstream static inlines would also have helped
	- use static inline functions for structure accessors

How can one be 100% certain that this transformation is correct?
- verify against the original hand-created backport

Coccinelle revealed inconsistencies in the original backport

Future work:
    - linux-next conflicts required manual fixing
    - once every two weeks - manual intervention required
	    - can this be automated?

Solicitation for funding for this project at InRIA

Q: Has this been used for backporting only drivers, or also subsystems?
A: Both

"Atomic patches" = a patch for a single collateral evolution

Q: What is the strategy of backporting?
A: Linux kernel developers should not have to do backporting
	- it should be done automatically

Q: Why are backports necessary?
A: "a lot of people can't use the latest kernel"

Q: Do backports create a package that you can compile against any kernel?
A: yes


10:10 - 10:35
How to fix bugs automatically? Martin Monperrus

Automatic software repair

Software has a lot of bugs
We spend a lot of time fixing them
How to detect them?
How to generate a patch to fix them automatically?

There are simple bugs
Is it possible to devise algorithms that take a description of the bug and generate a proposed patch?
Perhaps generate several different patches and allow the developers to choose between them?

(speaker drawing comparisons to "le merveilleux" recipes)

Choose a class of bugs
- Buffer overflow? Unhandled exception? Infinite loops? Crashes? Memory leaks? API misuse? Buggy conditions?
	* focusing on buggy conditions, missing preconditions
	* students focusing on infinite loops and exceptions

Identify a good bug oracle
- A bug oracle tells you "yes there is a bug" or "no it is fixed"
	- examples: crashing input, failing test cases (Java)
	- enables trial-and-error automatic generation of attempted fixes
	
A good bug oracle is automated and does not take too long to run

Set up repair operators
- "a generic modification on source code"
  - e.g., adding preconditions

"Some repair operators are surgical, others are very violent"
- "In some cases one needs a regression oracle"
- a strong test suite is needed
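
The oracle/operator loop described above can be sketched roughly as follows. This is only an illustration of the trial-and-error idea, not the speaker's tooling: the buggy `mean` function, both oracles, and both repair operators are hypothetical stand-ins invented for the example.

```python
from typing import Callable, Iterable, List, Optional

# Hypothetical buggy program: crashes on an empty list (missing precondition).
BUGGY_SOURCE = "def mean(values):\n    return sum(values) / len(values)\n"


def load(source: str) -> Callable:
    """Compile a candidate program and return its `mean` function."""
    namespace: dict = {}
    exec(source, namespace)
    return namespace["mean"]


def bug_oracle(source: str) -> bool:
    """Return True while the bug is still present (the crashing input still crashes)."""
    try:
        load(source)([])
    except ZeroDivisionError:
        return True
    return False


def regression_oracle(source: str) -> bool:
    """Return True if previously correct behaviour is preserved."""
    try:
        mean = load(source)
        return mean([2, 4]) == 3 and mean([5]) == 5
    except Exception:
        return False


def replace_body(source: str) -> Iterable[str]:
    """A 'violent' repair operator: throw the body away entirely."""
    yield "def mean(values):\n    return 0\n"


def add_precondition(source: str) -> Iterable[str]:
    """A 'surgical' repair operator: try candidate guards before the body."""
    for guard in ("if values is None: return 0", "if not values: return 0"):
        yield source.replace("    return sum", f"    {guard}\n    return sum")


def repair(source: str, operators: List[Callable]) -> Optional[str]:
    """Generate candidate patches and return the first one both oracles accept."""
    for operator in operators:
        for candidate in operator(source):
            if bug_oracle(candidate):        # cheap check first: bug still there
                continue
            if regression_oracle(candidate):  # only then run the wider suite
                return candidate
    return None


if __name__ == "__main__":
    print(repair(BUGGY_SOURCE, [replace_body, add_precondition]))
```

Here the "violent" operator silences the crash but is rejected by the regression oracle, while the second precondition candidate passes both checks.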

Q: why is the bug oracle separate from the regression oracle?  Why not add your bug oracle to the set of regression oracles?
A: to maintain performance during the automatic fix generation phase

CPU usage
- 10 minutes in one example, one hour in another

Q: testing .NET apps is different than testing the kernel - one may introduce a regression that is not covered by the existing test suite
A: agrees

Some bug oracles can be embedded as runtime assertions
But in most cases the intention is to make recommendations to developers

Does it work?
- "only the beginning"
- "~100 real bugs have been fixed automatically"

Q: How many of the ~100 bugs have been accepted by the maintainers?
A: Well, all of our development has been on existing bugs with existing fixes, to try to verify our approach

Q: what licenses do your team use?
A: mostly GPL but it depends

List of publicly available tools: http://www.monperrus.net/martin/automatic-software-repair-tools

10:40 - 11:05
GCC and LLVM collaboration review: Renato Golin

Had a meeting between the GCC and LLVM teams
"Had a bit of a clash"
Disagreement on licenses, agreement on technical issues

Problems:
	- gcc behaves in one way, clang behaves in another

Would like to standardize public interfaces

Much is shared: languages, assembly language, many command line options, libraries, binutils, LTO, PGO
Common projects: gas, ld, glibc/newlib, sanitizers

What about:
    - a common user interface?
    - better documentation of extensions?
	    - inline assembly?
    - avoiding unintended side effects that surprise users?

Driver: common user interface
- Command line flags, mostly
- Target descriptions can have subtle differences (e.g., arm-linux-gnueabihf-gcc)
- Return values

Behan: clang is a single compiler that targets multiple architectures, but gcc binaries are built separately for each target

Extension standards
- Inline assembler, attribute, language extensions, compiler extensions
- LLVM will never implement VLAIS
- LLVM will never implement nested functions

Getting involved:
http://clang.llvm.org/get_involved.html

"gcc is not the C standard"
- consider compiling your code with both gcc and clang
- one compiler may have different warnings than the other


11:05 - 11:30
Testing Multiple Architectures Using QEMU: Christopher Covington

testing with QEMU (qemu-system-$arch) cannot replace real machines
but testing with QEMU should replace "no testing at all"
existing users: Aboriginal linux, strace, virtme (https://github.com/amluto/virtme)
case study: CRIU
building: use native toolchain or cross-compiling, out-of-tree build support is fine but not necessary
run-time deps: root fs can be initramfs, image (genext2fs recommended), network/passthrough (works for >5G)

- virtio/9pfs
  - needed to work around problems related to image size (> 5 GB?)

Example QEMU command lines: http://www.spinics.net/lists/linux-virtualization/msg23031.html
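
As a rough illustration of the initramfs-based setup mentioned above (not taken from the linked examples), a test harness might assemble the qemu-system-$arch command line like this. The file paths, the default memory size, and the console names are assumptions that depend on the target machine.

```python
import shlex
from typing import List, Optional


def qemu_command(arch: str, kernel: str, initramfs: str,
                 machine: Optional[str] = None, memory_mb: int = 512,
                 console: str = "ttyS0", init: str = "/init") -> List[str]:
    """Build an argv list that boots `kernel` with `initramfs` as the root fs."""
    argv = [f"qemu-system-{arch}"]
    if machine:
        # Some targets (e.g. ARM) need an explicit machine model.
        argv += ["-M", machine]
    argv += [
        "-m", str(memory_mb),
        "-kernel", kernel,
        "-initrd", initramfs,
        # The console device name is target-specific; rdinit= picks the test runner.
        "-append", f"console={console} rdinit={init}",
        "-nographic",   # serial console on stdio, no display
        "-no-reboot",   # exit instead of rebooting when the guest resets
    ]
    return argv


if __name__ == "__main__":
    # Hypothetical paths; pass the list to subprocess.run() to actually boot.
    cmd = qemu_command("x86_64", "arch/x86/boot/bzImage", "test-initramfs.cpio.gz")
    print(shlex.join(cmd))
```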

- speed: optimize CFLAGS, run only what needed, static better than dynamic
  - but presenter has not measured yet

broonie: others are looking at the same approach (Grant Likely/Shuah Khan?)

luisr: also does the same thing, but not across separate architectures - attempt is to use this "to isolate oopses"

Does linux-next use emulation/virtualization for testing?

How to deal with the rootfs?  prepackaged or build from scratch?
- Support both?

Take a checkpoint of a booted-up system that's about to load a testcase

http://www.quicinc.com/

11:30 - 11:50
Undertaker: Valentin Rothberg

undertaker.cs.fau.de

How to avoid #ifdef bugs in the Linux kernel

kernel/smp.c: #ifdef CONFIG_CPU_HOTPLUG example
 - leaks memory!
	* CONFIG_CPU_HOTPLUG does not exist!
	* CONFIG_HOTPLUG_CPU is the right option

Undefined CPP identifiers evaluate to false
- can lead to dead/undead #ifdef blocks

Kconfig identifiers have the same problem
e.g. dependencies on undefined identifiers
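
The simplest form of this check (CONFIG_ symbols referenced in the source but never declared in Kconfig) can be approximated in a few lines. This is a simplified sketch and not how undertaker is implemented; the regular expressions and the sample inputs are assumptions made for the example.

```python
import re
from typing import List, Set

# Kconfig declarations: "config FOO" or "menuconfig FOO" at the start of a line.
KCONFIG_DECL = re.compile(r"^\s*(?:menu)?config\s+(\w+)", re.MULTILINE)
# Preprocessor conditionals: #if, #ifdef, #ifndef, #elif.
CPP_COND = re.compile(r"^\s*#\s*(?:ifdef|ifndef|if|elif)\b(.*)$", re.MULTILINE)
CONFIG_SYM = re.compile(r"\bCONFIG_(\w+)")


def declared_symbols(kconfig_text: str) -> Set[str]:
    """Collect every option name declared in a Kconfig file."""
    return set(KCONFIG_DECL.findall(kconfig_text))


def undeclared_references(source_text: str, declared: Set[str]) -> List[str]:
    """Return CONFIG_ symbols used in CPP conditionals that Kconfig never declares."""
    missing = set()
    for condition in CPP_COND.findall(source_text):
        for name in CONFIG_SYM.findall(condition):
            # Tristate options also yield CONFIG_<NAME>_MODULE.
            base = name[:-len("_MODULE")] if name.endswith("_MODULE") else name
            if name not in declared and base not in declared:
                missing.add("CONFIG_" + name)
    return sorted(missing)


if __name__ == "__main__":
    # Made-up inputs modelled on the CPU hotplug example above.
    kconfig = 'config SMP\n\tbool "SMP"\n\nconfig HOTPLUG_CPU\n\tbool "CPU hotplug"\n'
    source = "#ifdef CONFIG_CPU_HOTPLUG\n\tfree_cpumask_var(mask);\n#endif\n"
    print(undeclared_references(source, declared_symbols(kconfig)))
```

Any block guarded only by a symbol reported here can never be compiled in, which is the "dead block" case.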

How big is the problem?
- on average, 600-700 dead/undead blocks!
- 9k-10k dead/undead SLOC 
- at least the problem is getting smaller per kernel version...

How to avoid this?
- new tool: undertaker-checkpatch
	- checks git commits for #ifdef bugs
	- can be used like checkpatch.pl
	- can detect if bugs are added or repaired

Q: can this be run on a source tree like checkpatch.pl, or does it require a git tree?
A: requires a git tree

mach-omap2/board-h4.c example

Kconfig changes are critical
- can find cases where Kconfig lines are removed, but references persist in the tree

Can also find logical constraints
- nested defines where the inner #ifdef mandates a certain evaluation of the outer #ifdef
  - so an #else arm of the inner conditional can be dead/undead

undertaker feeds a SAT-solver
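
To illustrate the satisfiability question being asked, here is a toy version that enumerates all assignments instead of calling a real SAT solver. It assumes the usual reading of the terms: a block is "dead" if it can never be selected together with its parent, and "undead" if it is always selected whenever its parent is. The option names are invented.

```python
from itertools import product
from typing import Callable, Dict, List

Env = Dict[str, bool]
Formula = Callable[[Env], bool]


def satisfiable(formula: Formula, variables: List[str]) -> bool:
    """Brute-force SAT: try every true/false assignment of the variables."""
    return any(
        formula(dict(zip(variables, values)))
        for values in product([False, True], repeat=len(variables))
    )


def classify(parent: Formula, block: Formula, variables: List[str]) -> str:
    """Classify a block given its own condition and its parent's condition."""
    if not satisfiable(lambda env: parent(env) and block(env), variables):
        return "dead"      # can never be selected
    if not satisfiable(lambda env: parent(env) and not block(env), variables):
        return "undead"    # always selected once the parent is
    return "live"


if __name__ == "__main__":
    # Models:  #ifdef A / #if defined(A) || defined(B) / #else / #endif / #endif
    variables = ["A", "B"]
    outer = lambda env: env["A"]
    inner_if = lambda env: env["A"] or env["B"]
    inner_else = lambda env: not inner_if(env)
    print(classify(outer, inner_if, variables))
    print(classify(outer, inner_else, variables))
```

Enumeration is exponential in the number of options, which is why a real SAT solver is needed at kernel scale.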

runtime can take 2-4 minutes

Future work: 
    - vampyr

Q: Why not integrate this into the linux kernel tree?
A: too difficult, not enough time

Q: Why not integrate this into checkpatch.pl? 
A: not written in Perl

Q: Does it detect #ifdefs hidden behind wrappers such as IS_ENABLED(CONFIG_***)?
A: on preprocessor level yes, on C level we are working on that
    
Q: Is it possible to use this in other projects that use Kconfig?
A: yes, it works with l4-fiasco, coreboot, and busybox

Q: why is there a dependency on git - can't it just be run on a plain source tree?
Stefan: undertaker does not have a git dependency, but undertaker-checkpatch does
A: send patches or requests to us


11:50 - 12:10
Vampyr: Stefan Hengelein

Configurability-aware compile-testing of source files

Code is often compile-tested with only one .config

Configuration coverage with Vampyr
- create a minimal set of .configs that cover all of the #ifdef blocks in one file (on average 20% more compiler calls)
- apply Coccinelle, sparse, gcc, etc.
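
The coverage idea can be sketched as a set-cover problem. The version below is a greedy approximation over brute-force-enumerated configurations, so it is not guaranteed to find the minimal set and is not Vampyr's actual algorithm; block names and conditions are made up.

```python
from itertools import product
from typing import Callable, Dict, List

Config = Dict[str, bool]


def covering_configs(blocks: Dict[str, Callable[[Config], bool]],
                     variables: List[str]) -> List[Config]:
    """Greedily pick configurations until every selectable block is compiled once."""
    candidates = [
        dict(zip(variables, values))
        for values in product([False, True], repeat=len(variables))
    ]
    # Dead blocks (selectable under no configuration) cannot be covered; skip them.
    uncovered = {
        name for name, cond in blocks.items()
        if any(cond(config) for config in candidates)
    }
    chosen: List[Config] = []
    while uncovered:
        # Take the configuration that enables the most still-uncovered blocks.
        best = max(candidates, key=lambda c: sum(blocks[n](c) for n in uncovered))
        chosen.append(best)
        uncovered -= {n for n in uncovered if blocks[n](best)}
    return chosen


if __name__ == "__main__":
    # Hypothetical file with an #ifdef/#else pair and one independent block.
    blocks = {
        "smp":   lambda c: c["SMP"],
        "up":    lambda c: not c["SMP"],
        "debug": lambda c: c["DEBUG"],
    }
    for config in covering_configs(blocks, ["SMP", "DEBUG"]):
        print(config)
```

Each returned configuration would then be handed to the compiler or to a checker such as sparse.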

Evaluated for Linux v3.2
- number of compile warnings & errors increased significantly with the Vampyr sets
- found quite a few bugs
* "Luckily the number of found warnings and errors is lower in Linux v3.17"

MIPS v3.17 example - db120_pci_init()
- but hard to know if this was intentional?
	- why not #error?
	
ARM v3.17 example - arch/arm/mm/dma-mapping.c: 'atomic_pool' variable declared in an #ifdef, but used outside
- also same problem with a declaration of __in_atomic_pool()

Another MIPS v3.17 issue: variable only used in an #ifdef block but undeclared
- so it's clear the code has never been compiled

quotes Roeck: maintainers should care about make allmodconfig and make allyesconfig

licensed under GPLv3 (undertaker.cs.fau.de)
integration into undertaker-checkpatch is on the way

Q: how many patches have been fixed so far?
A: seven fix patches committed so far

cross-compilers can also be used, and an arbitrary number of flags can be passed to the compiler through vampyr


12:10 - 12:30
Discussion


(pwsan)