Live Kernel Patching Microconference Notes

Welcome to Linux Plumbers Conference 2014.

1:00 Steven Rostedt: Overview of the Live Kernel Patching methods
(but letting other people do the talking :-)

Ground rules: 10 minutes to make a case for each of the major approaches.

Selling their solutions:

Kpatch:
  - made for security and stability fixes, not major kernel updates
  - released on GitHub in 2014: https://github.com/dynup/kpatch
  - already stable and useful: covers > 95% of security fixes
  - works with most stock kernels (mainline)
  - Sounds crazy!
  - integrated with the kernel
  - uses ftrace to do the patching
  - replacement functions are first-class functions and compatible with oops, ftrace, kprobes, etc.
  - sets a taint flag when used (people do stupid things)
  - patching is "safe"
  - self-contained within kernel modules

  Is it safe?
  - It should not crash the kernel; that would defeat the purpose.
  - It is safe if you are careful with your patch analysis.

  How does it work?
  1. Build the patch module (kpatch-build foo.patch)
     - a tool makes the module
     - a human still needs to review the patch and the result to make sure
  2. Patch the kernel
     - kpatch load kpatch-foo.ko (insmod)
  - Building the patch module is much harder than patching the kernel.
  - Patching the kernel takes 4 steps: load, link, safety check, then patch it.
  - Uses ftrace, stop_machine and stack backtrace checks.
  - ftrace hooks the start of the function and calls the kpatch code, which changes the return address used by the ftrace trampoline so that it returns into the new function instead of the function that was hooked.

  Features:
  - patch rollback
  - patch on reboot (to keep the state of the machine)
  - multiple patches
  - module patching (deferred): patch the module before it is loaded
  - user load/unload function hooks, to allow users to run code at the time the patch is made
  - option to skip the safety check for certain functions
  - shadow variables: can virtually add fields to structs at run time. kABI too.

  Limitations:
  - Human safety analysis is required.
  - Not a general-purpose upgrade tool
  - Not all CVE patches are supported
  - stop_machine latency of 1 - 40 ms. Masami Hiramatsu has proposed an alternative with no latency.
  - Currently only x86_64

Kgraft:
  - RCU-like redirection. Also uses ftrace, like kpatch does.
  - Different model to handle the safety of the process
  - Provides a consistent world view to the tasks
  - Uses a "reality check" trampoline to decide whether to call the old or the new function
  - If the kernel is in a loop and the patch is applied mid-loop, the old function could be called followed by the new one, causing a crash. kgraft fixes this by making sure the old function is still called until the task hits a point where it can switch to the new function.
  - Sleeping tasks must wake up to be able to cross the point after which they can use the new function.
  - The migration process can also be skipped if a human sees that the patch does not need this check.

  Advantages of kgraft:
  - Does not call stop_machine.
  - Does not generate the binaries; just pulls the code and uses gcc to build the module.
  - Seems easier to hand code.

  Limitations:
  - Shares a lot of kpatch's limitations
  - designed for critical bugs
  - data structures require some special care
  - patches still need to be reviewed
  - Some patches may not be possible due to the RCU-like changes (e.g. changing lock ordering).

  Get it:
  - part of the SLE12 kernel tree, and a repository on git.kernel.org
  - a VM image is available to play with

  - A second patch can be added before the first one has passed through all the RCU-like states needed to finish.
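A rough userspace sketch of the kgraft "reality check" idea described above: each task keeps calling the old function until it crosses a safe point, and only then is switched to the new one. This is an illustration only, not kgraft code; struct task, struct patch, call_patched(), task_cross_safe_point() and patch_complete() are all hypothetical names, and the old/new function pair is made up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-task state: has this task crossed a safe point yet? */
struct task {
    bool migrated;
};

typedef int (*patched_fn)(int);

/* Hypothetical patch record pairing the old and the new implementation. */
struct patch {
    patched_fn old_fn;
    patched_fn new_fn;
};

/* "Reality check": call the implementation that matches the task's view. */
static int call_patched(const struct patch *p, const struct task *t, int arg)
{
    assert(p != NULL && t != NULL);
    return t->migrated ? p->new_fn(arg) : p->old_fn(arg);
}

/* Called when a task reaches a point where switching is safe. */
static void task_cross_safe_point(struct task *t)
{
    t->migrated = true;
}

/* The patch has finished only once every task has migrated. */
static bool patch_complete(const struct task *tasks, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (!tasks[i].migrated)
            return false;
    return true;
}

/* Made-up old/new pair: the old version has an off-by-one bug. */
static int old_limit(int n) { return n + 1; }
static int new_limit(int n) { return n; }
```

With two tasks, a task that has not yet crossed its safe point keeps getting old_limit() even after another task has moved to new_limit(), and patch_complete() stays false until both have crossed. A task that never crosses (for example one that sleeps forever) keeps the patch from completing, which is the concern raised about sleeping tasks.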

CRIU & kexec:
  Pavel Emelyanov: "Seamless" kernel update

  Simple case:
  - stop the tasks that need to survive
  - dump their state as a set of image files
  - change the kernel with kexec
  - restore the tasks from the image files

  How (speeding up the state dump):
  - keep memory in memory (not written into image files)
  - keep image files in memfs
  - collect PFNs with data
  - get back the memory with data (very quick reboot)

  Known problems:
  - downtime is very sensitive:
    - can't reboot with dirty memory
    - the amount of disk reads (open) on restore can take time
  - No reliable way to protect the memory of the images, and data can be corrupted
  - Subject to all the issues that kexec has (e.g. not all HW is supported)
  - Not yet implemented for the upstream kernel

1:30 Pavel Emelyanov: Updating the kernel using CRIU and KExec (Q&A)

Questions:
  - Have you considered fscache?
    No.
  - What kind of testing have you done, with different ABIs in the kernel?
    Don't know of any ABI changes. The kernel will not change userspace, but the VDSO can be an issue because it can change between versions.
  - What about the state of a device?
    Checkpoint/restore should handle this. The main use is containers: freeze all containers, and containers do not talk to hardware. But this could be an issue with desktops.
  - Have you considered writing a kernel module to reserve memory?
    Will talk to the kernel community about saving memory across kexec.
    This already exists today as a boot argument (kernel command line): memmap? It can specify a region of memory to set aside. A filesystem can be loaded into this memory, which the kernel otherwise ignores, and it bypasses the page cache.
    Realistically that won't work on servers, as they may not guarantee preserving memory on warm reboot.
    There are possible issues with NUMA.

1:40 Josh Poimboeuf: kpatch vs kGraft

  - Should really be called kpatch + kgraft. Both want a standard live patching solution.
  - Question: can we run both kpatch and kgraft in the same kernel? Yes!
  - Standard format for the module? No, the only common part is that both use ftrace, but we are trying to decide how to use each other's code.
  - Need to unify the API.
    That is the first step to continue.
  - Add the "safe" patches first: the ones that do not require checks. This allows a common API for both kgraft and kpatch. No stop_machine or RCU. Can cover 90% of patches.
  - What to do if a function addresses static variables that were in both the new and old functions? Symbols in multiple locations.
    kgraft has humans review this. kpatch looks for the exact symbol that is being used; perhaps kgraft can do the same. This is done by the kpatch linker to find the symbol (if it is not a global variable). The kgraft folks are not opposed to using this too.
  - 95% of patches are fine, but human analysis is still required. Is this just looking at the source code?
    Yes, looking for specific changes in the code, like locking order and data structures.
  - WARN macros embed the line number of the original source code. Analysis code can detect all of these as changed code when they really didn't change. kpatch added a hack to find this case. Should use the debug table for the warnings. Uses the object file to do the compare. The line number will not match the original code, but the warning should also have the backtrace to show it was patched.
  - Safety of unloading patches?
    We need a way to make sure patching works, so by using the same method we should be able to unpatch the same way. May need to run 6 months. Can either stack a reversion or remove the patch itself. Some people may want it, but it may not be supported officially.
  - Incremental patching is supported, but may be dangerous due to dependencies between patches at build time. Incremental removal (removing a patch in the middle) is not supported.
  - The customer only uses RPM and doesn't add the patch themselves, to keep them from picking and choosing patches. RPM dependencies are used too.
  - Users can have no maintenance window. Start and shutdown every 3 years or so. This is why people are interested in live patches. The reason to reboot is usually an accident, not on purpose.
  - crash(8) compatibility?
    New functions are added via modules, and crash knows how to handle that.
  - Do you share the same taint flag?
    Probably not, but they will before it is merged.

[there was a pause for a pee]

2:15 Jiri Kosina: What features are needed from a live patching solution? Also what is needed from ftrace.

Audience participation is asked.

Safe / consistent way of function replacement:
  - stop_machine() for kpatch and ksplice (being removed by Masami)
  - lazy way, via the RCU-like way
  - checkpoint/restart way

Desired features:
  - Remove the need for a human to review patches written by hand.
  - Easy-to-use API for both kpatch and kgraft
    - where can the two methods unite?
    - need to sit down and look at each other's methods
    - use "fire and forget" for the patches that are easy to use (the 95%)
  - have a way to know what has been changed
    - uname?
    - taint flag?
    - proper stack dump
  - Reverts? (already discussed)
  - Modules: add a way to update a module that is not loaded yet
  - some of this is not applicable to CRIU
  - tooling? patch to object dump.

  - Need to add a common API, plus metadata to give users visibility into what changed.
  - Need to see which patches have been applied and reverted. Basically a timeline of what happened to the kernel.
  - Should have a way to log what has changed in the kernel. Like dmesg, but not using the printk buffer.
  - No link-time optimization is used, but inlining optimizations can still be used.
  - How long does it take to add the patch?
    kpatch: 1 - 40 ms. kGraft: maybe several minutes, but there's no impact on the running system (it doesn't stop).
  - For debug symbols, it is fine because it is still a normal kernel module.
  - What about patching a function that happens to have a kprobe inside? With kGraft this could take minutes.
  - Worry about kallsyms having the patched names. A user might get confused by wanting to add a kprobe on a function that was patched while kallsyms still has the old function, without any notification that it is no longer being called.
  - What about security changes? With kGraft, is the time that the bug can exist in the kernel unbounded, because some tasks have not been patched yet?
    Running functions will get the fix rather quickly. Sleeping functions will get patched when they wake up. Tasks can also be forced to wake up. Wake-ups are done by signals. A continue signal gets ignored, so the task is not woken up.
  - What about tasks that are stuck in D (uninterruptible) wait?
    There's no good answer to that.
  - How do kernel threads get patched? They run in an endless loop.
    kGraft patches on return to userspace (note I didn't specify that earlier). It uses the freezer as a point to update. Tejun is not happy with this approach. Try to convert all kernel threads to work queues.
  - Why doesn't kGraft check the stack on sched switch? That's what kpatch does in stop_machine.
    But there's no guarantee that there are no issues with that.
    Example: a buggy function is called between two points between which it can't be changed. If a schedule happens there, then there might be an issue.
    Example: an off-by-one bug that just happened to work because all the buggy functions were off by one.
  - Worry about a patch that is not simple. Recommended to let the user know if the consistency model must be run.
  - Asked Coly Li about using this with more complex patches.
    Will apply it if it fixes 30% of crashes of boxes.
    Example: time slice overflow. Several machines crashed. This would have been a good way to fix the bug without a reboot. Must also verify that the fix does work.

3:00 Tea break

3:15 Martin Pohlack: Exploring synergies between Linux Kernel and Xen hypervisor live patching

Hotpatching building blocks (Linux / Xen):
  1. preparing: Linux creates modules; Xen?
  2. loading: Linux modules
  3. splicing: ftrace

  - No ABI yet. Uses build IDs for Xen.
  - Freeze the build environment (gcc, gas)
  - source -> patch -> compile patch
  - compare trees: changed objects and functions
  - rebuild with -ffunction-sections
  - extract functions
  - add some glue code
  - link against specific targets for xen-syms
  - tag with build ID

Loading?
  - No modules in Xen. Has something similar but much simpler.
  - Activation/deactivation glue
  - linking and relocation done in userland, like old Linux does

3. Splicing
  - Function granularity
  - add a jmp instruction at the start of the old function to redirect to the new code
  - Linux is a bit more complicated. Xen has a simpler design:
    - No permanent threads. Stacks are not preserved.
    - Need a barrier with timeout at hypervisor exit; can abort and retry.

Challenges:
  - capturing the original build environment
    - gcc & gas, but may also need to know more (e.g. Koji)
  - the Xen build system is not made for incremental builds. Need to build the world for each compile.
    - compile.h is auto-generated for each build
  - build paths and line numbers. Has the same issues as kgraft and kpatch.
    A question was asked about the build environment for kpatch and kgraft.
  - Compare to what level?
    - symbol changes for static variables cause pain for comparisons (e.g. fn.14077)
    - uses objdump to compare
    - doesn't support exception tables. Asked how kpatch does this. There may be something special.
  - -ffunction-sections: no support in Xen, but it can be compiled that way. _init etc. -> multiple functions in a single section.
  - inter-hotpatch dependencies
  - Hotpatch unloading may not be safe. For Linux, it should be OK if there are no function pointers. Backtraces are checked to see if functions are in use.

Vojtěch Pavlík observes: with its ability to wait for all CPUs to exit the hypervisor, Xen has the capabilities of both kGraft and kpatch.

  - Hotpatch signing? For Linux it is the same as for all modules.
  - gcc dependencies? kpatch: a lot; kgraft: not much. But they both require a compiler with -mfentry support.
  - Mailing list? For Xen specifically, xen-devel. aliguori suggests that there could be use for a cross-community live patching email list.
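
For reference, the Xen splicing step noted above (a jmp at the start of the old function that redirects to the new code) can be sketched as encoding an x86-64 near jump: opcode 0xE9 followed by a 32-bit little-endian displacement relative to the end of the 5-byte instruction. This is only a sketch of the encoding, not the Xen implementation; encode_jmp_rel32() and JMP_REL32_LEN are hypothetical names, and actually writing the bytes into live code (permissions, synchronization, the barrier at hypervisor exit) is not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define JMP_REL32_LEN 5 /* 1 opcode byte + 4 displacement bytes */

/*
 * Encode "jmp rel32" to be placed at old_addr so that it lands on new_addr.
 * Returns 0 on success, -1 if the target is out of rel32 range.
 */
static int encode_jmp_rel32(uint64_t old_addr, uint64_t new_addr,
                            uint8_t out[JMP_REL32_LEN])
{
    assert(out != NULL);

    /* The displacement is relative to the address of the next instruction. */
    int64_t disp = (int64_t)(new_addr - (old_addr + JMP_REL32_LEN));
    if (disp < INT32_MIN || disp > INT32_MAX)
        return -1;

    uint32_t u = (uint32_t)(int32_t)disp;
    out[0] = 0xE9; /* near jump, 32-bit relative displacement */
    for (int i = 0; i < 4; i++)
        out[1 + i] = (uint8_t)(u >> (8 * i)); /* little-endian byte order */
    return 0;
}
```

For example, redirecting a function at 0x1000 to new code at 0x2000 gives the bytes E9 FB 0F 00 00, since 0x2000 - (0x1000 + 5) = 0xFFB. A target more than 2 GiB away cannot be reached with this form, so the function reports an error in that case.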