[RFC PATCH 0/4] Add support for LZ4-compressed kernels
Andrew Morton
On Sat, 26 Jan 2013 14:50:43 +0900
Kyungsik Lee <kyungsik.lee@...> wrote:
> This patchset is for supporting LZ4 compressed kernel and initial ramdisk on [...]

What's this "with enabled unaligned memory access" thing? You mean "if the arch supports CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS"? If so, that's only x86, which isn't really in the target market for this patch, yes?

It's a lot of code for a 50ms boot-time improvement. Does anyone have any opinions on whether or not the benefits are worth the cost?
|
kyungsik.lee <kyungsik.lee@...>
On 2013-01-29 7:25 AM, Andrew Morton wrote:
> On Sat, 26 Jan 2013 14:50:43 +0900 [...]

Yes, exactly. If the arch supports CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS, the LZ4 decompressor is expected to give a larger boot-time improvement. Currently two architectures in mainline support it: x86 and powerpc. ARM (v6 or above) is also expected to support it since the commit below.

Commit ID: 5010192d5
ARM: 7583/1: decompressor: Enable unaligned memory access for v6 and above
by Dave Martin

The test result (167ms) comes from an ARM (v7 arch), MSM8960-based board with CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS set.

Not only the kernel but also the ramdisk can be compressed with LZ4, so the boot time would improve further. The test case above didn't include the decompression time for an LZ4-compressed ramdisk.

So far the implementation targets boot-time improvement for LZ4-compressed kernel and ramdisk images, but the decompressor module is exported as an interface for other uses, like LZO. With an LZ4 compressor (not yet implemented for the kernel), it is expected to be used in many places in the kernel, such as crypto and fs (squashfs, btrfs).

Thanks,
Kyungsik
|
Nicolas Pitre <nico@...>
On Mon, 28 Jan 2013, Andrew Morton wrote:
> On Sat, 26 Jan 2013 14:50:43 +0900 [...]

I'm guessing this is referring to commit 5010192d5a.

> It's a lot of code for a 50ms boot-time improvement. Does anyone have [...]

Well, we used to have only one compressed format. Now we have nearly half a dozen, with the same worthiness issue between themselves. Either we keep it very simple, or we make it very flexible. The former would argue in favor of removing some of the existing formats, the latter would let this new format in.

Nicolas
|
H. Peter Anvin <hpa@...>
Uhm... you're saying we have to be at one extreme or the other?
We probably could drop the legacy lzma format, but someone might rely on it.

Nicolas Pitre <nico@...> wrote:
> On Mon, 28 Jan 2013, Andrew Morton wrote:
> > On Sat, 26 Jan 2013 14:50:43 +0900 [...]
> > [...] ramdisk on [...]

--
Sent from my mobile phone. Please excuse brevity and lack of formatting.
|
Richard Cochran <richardcochran@...>
On Mon, Jan 28, 2013 at 02:25:10PM -0800, Andrew Morton wrote:
In the embedded space, quick boot is a really important feature to have. Many people resort to awful hacks in order to improve boot time, and so I would welcome this option.

I have seen arm systems that boot in 300 ms. I would say that 50 ms is maybe not such a small improvement after all.

Thanks,
Richard
|
Russell King - ARM Linux <linux@...>
On Mon, Jan 28, 2013 at 02:25:10PM -0800, Andrew Morton wrote:
> What's this "with enabled unaligned memory access" thing? You mean "if [...]

Well... when I saw this my immediate reaction was "oh no, yet another decompressor for the kernel".

We have five of these things already. Do we really need a sixth?

My feeling is that we should have:
- one decompressor which is the fastest
- one decompressor for the highest compression ratio
- one popular decompressor (eg conventional gzip)

And if we have a replacement one for one of these, then it should do exactly that: replace it.

I realise that various architectures will behave differently, so we should really be looking at numbers across several arches.

Otherwise, where do we stop adding new ones? After we have 6 of these (which is after this one). After 12? After the 20th?
|
Egon Alter <egon.alter@...>
On Tuesday, 29 January 2013, 10:15:49, Russell King - ARM Linux wrote:
> On Mon, Jan 28, 2013 at 02:25:10PM -0800, Andrew Morton wrote:
> > What's this "with enabled unaligned memory access" thing? You mean "if [...]
> Well... when I saw this my immediate reaction was "oh no, yet another [...]

The problem gets more complicated, as the "fastest" decompressor usually creates larger images which need more time to load from the storage. E.g. a one MB larger image on a 10 MB/s storage (note: bootloaders often configure the storage controllers in slow modes) gives 100 ms more boot time, thus eating the gain of a "fast decompressor".

Egon
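
The load-time penalty described above is simple arithmetic: extra image size divided by storage throughput. A minimal sketch follows; the function name `extra_load_ms` is made up for illustration, and the 40 MB/s figure is a hypothetical comparison point, not a number from the thread.

```python
def extra_load_ms(extra_mb: float, throughput_mb_per_s: float) -> float:
    """Extra load time in ms caused by an image that is extra_mb larger."""
    if throughput_mb_per_s <= 0:
        raise ValueError("throughput must be positive")
    # MB / (MB/s) gives seconds; multiply by 1000 for milliseconds.
    return extra_mb * 1000.0 / throughput_mb_per_s


if __name__ == "__main__":
    # 1 MB larger image on 10 MB/s storage, the case given in the message.
    print(f"{extra_load_ms(1.0, 10.0):.0f} ms")
    # Hypothetical faster storage (40 MB/s): the penalty shrinks accordingly.
    print(f"{extra_load_ms(1.0, 40.0):.0f} ms")
```

The slower the bootloader configures the storage controller, the more a larger image costs, which is why decompression time alone does not settle the comparison.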
|
Russell King - ARM Linux <linux@...>
On Tue, Jan 29, 2013 at 12:43:20PM +0100, Egon Alter wrote:
> On Tuesday, 29 January 2013, 10:15:49, Russell King - ARM Linux wrote:
> > On Mon, Jan 28, 2013 at 02:25:10PM -0800, Andrew Morton wrote:
> > > What's this "with enabled unaligned memory access" thing? You mean "if [...]
> > Well... when I saw this my immediate reaction was "oh no, yet another [...]
> the problem gets more complicated as the "fastest" decompressor usually [...]

Ok. We already have:
- lzma: 33% smaller than gzip, decompression speed between gzip and bzip2
- xz: 30% smaller than gzip, decompression speed similar to lzma
- bzip2: 10% smaller than gzip, slowest decompression
- gzip: reference implementation
- lzo: 10% bigger than gzip, fastest

And now:
- lz4: 8% bigger than lzo, 16% faster than lzo?
  (I make that 16% bigger than gzip)

So, image size wise, on a 2MB compressed gzip image, we're looking at the difference between LZO at 2.2MB and LZ4 at 2.38MB.

But let's not stop there - the figures given for a 13MB decompressed image were:

lzo: 6.3MB, 301ms
lz4: 6.8MB, 251ms (167ms, with enabled unaligned memory access)

At 10MB/s (your figure), it takes .68s to read 6.8MB as opposed to .63s for LZO. So, totalling up these figures gives the overall figure:

lzo: 301ms + 630ms = 931ms
lz4: 167ms + 680ms = 797ms

Which gives the tradeoff at 10MB/s of 14% faster (but only with efficient unaligned memory access.)

So... this faster decompressor is still the fastest even with your media transfer rate factored in.

That gives an argument for replacing lzo with lz4...
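
The load-plus-decompress comparison above can be recomputed from the quoted inputs (6.3MB/301ms for lzo, 6.8MB/167ms for lz4 with unaligned access, 10MB/s storage). The sketch below is only a back-of-the-envelope model; `total_ms` and `break_even_mb_per_s` are illustrative names, and the break-even throughput is an extra derived figure that does not appear in the thread. Note that summing the quoted inputs gives 847ms for lz4 rather than 797ms, so the gain works out nearer 9% than 14%; lz4 still comes out ahead under this model.

```python
def total_ms(image_mb: float, decompress_ms: float, throughput_mb_per_s: float) -> float:
    """Load time plus decompression time, in milliseconds."""
    return image_mb * 1000.0 / throughput_mb_per_s + decompress_ms


def break_even_mb_per_s(small: tuple, large: tuple) -> float:
    """Storage throughput at which both formats take the same total time.

    Each argument is (image_mb, decompress_ms); `large` is the bigger image
    that decompresses faster. Below this throughput the smaller image wins.
    """
    extra_mb = large[0] - small[0]
    saved_ms = small[1] - large[1]
    return extra_mb * 1000.0 / saved_ms


LZO = (6.3, 301.0)            # figures quoted in the message
LZ4_UNALIGNED = (6.8, 167.0)  # lz4 with unaligned access enabled

if __name__ == "__main__":
    lzo = total_ms(*LZO, throughput_mb_per_s=10.0)
    lz4 = total_ms(*LZ4_UNALIGNED, throughput_mb_per_s=10.0)
    print(f"lzo: {lzo:.0f} ms, lz4: {lz4:.0f} ms")
    print(f"lz4 saves {(lzo - lz4) / lzo * 100:.1f}% of total time")
    print(f"break-even throughput: {break_even_mb_per_s(LZO, LZ4_UNALIGNED):.2f} MB/s")
```

Under these assumptions lz4 only loses to lzo once storage drops below roughly 3.7 MB/s.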
|
H. Peter Anvin <hpa@...>
On 01/29/2013 02:15 AM, Russell King - ARM Linux wrote:
> On Mon, Jan 28, 2013 at 02:25:10PM -0800, Andrew Morton wrote:
> > What's this "with enabled unaligned memory access" thing? You mean "if [...]
> Well... when I saw this my immediate reaction was "oh no, yet another [...]

The only concern I have with that is if someone paints themselves into a corner and absolutely wants, say, LZO.

Otherwise, per your list it pretty much sounds like we should have lz4, gzip, and xz.

-hpa

--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel. I don't speak on their behalf.
|
Johannes Stezenbach <js@...>
On Mon, Jan 28, 2013 at 11:29:14PM -0500, Nicolas Pitre wrote:
> On Mon, 28 Jan 2013, Andrew Morton wrote:
> > On Sat, 26 Jan 2013 14:50:43 +0900 [...]
> I'm guessing this is referring to commit 5010192d5a.

This reminded me to check the status of the lzo update and it seems it got lost?
http://lkml.org/lkml/2012/10/3/144

(Cc: added, I hope Markus still cares and someone could eventually take his patch once he resends it.)

Johannes
|
Nicolas Pitre <nico@...>
On Tue, 29 Jan 2013, H. Peter Anvin wrote:
> On 01/29/2013 02:15 AM, Russell King - ARM Linux wrote:
> > On Mon, Jan 28, 2013 at 02:25:10PM -0800, Andrew Morton wrote:
> > > What's this "with enabled unaligned memory access" thing? You mean "if [...]
> > Well... when I saw this my immediate reaction was "oh no, yet another [...]
> The only concern I have with that is if someone paints themselves into a [...]

That would be hard to justify given that the kernel provides its own decompressor code, making the compression format transparent to bootloaders, etc. And no one should be poking into the compressed zImage.

> Otherwise, per your list it pretty much sounds like we should have lz4, gzip, [...]

I do agree with that.

Nicolas
|
H. Peter Anvin <hpa@...>
On 01/30/2013 10:33 AM, Nicolas Pitre wrote:
> That would be hard to justify given that the kernel provides its own [...]

Some utterly weird things like the Xen domain builder do that, because they have to. That is why we explicitly document that the payload is ELF and how to access it in the bzImage spec.

-hpa
|
Nicolas Pitre <nico@...>
On Thu, 31 Jan 2013, H. Peter Anvin wrote:
> On 01/30/2013 10:33 AM, Nicolas Pitre wrote:
> > That would be hard to justify given that the kernel provides its own [...]
> Some utterly weird things like the Xen domain builder do that, because [...]

Are you kidding?

And what format do they expect?

If people are doing weird things with formats we're about to remove then it's their fault if they didn't make upstream developers aware of it. And if the reason they didn't tell anyone is because it is too nasty for public confession then they simply deserve to be broken and come up with a more sustainable solution.

Nicolas
|
H. Peter Anvin <hpa@...>
On 01/31/2013 02:16 PM, Nicolas Pitre wrote:
> Are you kidding? [...]

I think they can be fairly flexible. Obviously gzip is always supported. I don't know the details.

> If people are doing weird things with formats we're about to remove then [...]

Well, it is too nasty for public confession, but it's called "paravirtualization".

-hpa
|
Nicolas Pitre <nico@...>
On Thu, 31 Jan 2013, H. Peter Anvin wrote:
> On 01/31/2013 02:16 PM, Nicolas Pitre wrote:
> > Are you kidding? [...]
> I think they can be fairly flexible. Obviously gzip is always [...]

The fact that you are aware of it means we're not going to break them. But my point is that we must not be held back just in case someone out there might have painted himself in a corner without telling anyone.

Nicolas
|
H. Peter Anvin <hpa@...>
On 01/31/2013 06:28 PM, Nicolas Pitre wrote:
> The fact that you are aware of it means we're not going to break them. [...]

Yes. However, it makes it more questionable to simply rip out compression methods without warning. Not that warnings help, as we have learned.

-hpa
|
kyungsik.lee <kyungsik.lee@...>
On 2013-01-30 6:09 AM, Rajesh Pawar wrote:
> Andrew Morton <akpm@...> wrote:
> > [...]
> BTW, what happened to the proposed LZO update - wouldn't it be better to merge this first?

I didn't test with the proposed LZO update you mentioned. Sorry, which one do you mean?

I did some tests with the latest LZO in the mainline. As a result, LZO is not faster with unaligned access enabled on ARM. It is actually slower.

Decompression time: 336ms (383ms, with unaligned access enabled)

You may refer to https://lkml.org/lkml/2012/10/7/85 to know more about it.

Thanks,
Kyungsik
|
kyungsik.lee <kyungsik.lee@...>
On 2013-01-29 8:43 PM, Egon Alter wrote:
> On Tuesday, 29 January 2013, 10:15:49, Russell King - ARM Linux wrote:
> > On Mon, Jan 28, 2013 at 02:25:10PM -0800, Andrew Morton wrote:
> > > What's this "with enabled unaligned memory access" thing? You mean "if [...]
> > Well... when I saw this my immediate reaction was "oh no, yet another [...]
> the problem gets more complicated as the "fastest" decompressor usually [...]

Yes, the larger image could matter. Definitely it takes longer.

Here are some updated test cases, including "loading time":

                        lzo      lz4
  loading time:         480ms    510ms
  decompression time:   336ms    180ms (with efficient unaligned memory access enabled and ARM optimization)
  total time:           816ms    690ms

lz4 is still 15% faster in total time. This one is similar to the simulated result by Russell King.

Thanks,
Kyungsik
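
As a quick check of the measured figures above, the totals and the 15% claim can be recomputed directly. This is only a sketch over the numbers in the message; the dictionary layout and the helper names `total_ms` and `percent_faster` are illustrative.

```python
# Measured times in milliseconds, copied from the message above.
MEASURED_MS = {
    "lzo": {"load": 480, "decompress": 336},
    "lz4": {"load": 510, "decompress": 180},
}


def total_ms(name: str) -> int:
    """Loading time plus decompression time for one format."""
    return sum(MEASURED_MS[name].values())


def percent_faster(baseline_ms: float, candidate_ms: float) -> float:
    """How much less total time the candidate needs, relative to the baseline."""
    return (baseline_ms - candidate_ms) / baseline_ms * 100.0


if __name__ == "__main__":
    lzo, lz4 = total_ms("lzo"), total_ms("lz4")
    print(f"lzo: {lzo} ms, lz4: {lz4} ms")
    print(f"lz4 is {percent_faster(lzo, lz4):.1f}% faster in total time")
```

The 30ms of extra loading time for lz4 is small next to the 156ms saved in decompression, which is where the roughly 15% net gain comes from.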
|
Markus F.X.J. Oberhumer <markus@...>
On 2013-02-01 08:00, kyungsik.lee wrote:
> On 2013-01-30 6:09 AM, Rajesh Pawar wrote:
> > Andrew Morton <akpm@...> wrote:
> > > [...]
> > BTW, what happened to the proposed LZO update - wouldn't it be better to merge [...]
> I didn't test with the proposed LZO update you mentioned. Sorry, which one do [...]

In fact you can easily improve LZO decompression speed on armv7 by almost 50% by adding just a few lines for enabling unaligned access:

armv7 (Cortex-A9), Linaro gcc-4.6 -O3, Silesia test corpus, 256 kB block-size:

                 compression speed   decompression speed
  LZO-2005    :      27 MB/sec            84 MB/sec
  LZO-2012    :      44 MB/sec           117 MB/sec
  LZO-2013-UA :      47 MB/sec           167 MB/sec

Please see my other mail to LKML for details.

Cheers,
Markus

> As a result, LZO is not faster in an unaligned access enabled on ARM. Actually [...]

--
Markus Oberhumer, <markus@...>, http://www.oberhumer.com/
|
Markus F.X.J. Oberhumer <markus@...>
On 2013-01-30 11:23, Johannes Stezenbach wrote:
> On Mon, Jan 28, 2013 at 11:29:14PM -0500, Nicolas Pitre wrote:
> > On Mon, 28 Jan 2013, Andrew Morton wrote:
> > > On Sat, 26 Jan 2013 14:50:43 +0900 [...]
> > I'm guessing this is referring to commit 5010192d5a.
> This reminded me to check the status of the lzo update and it [...]

The proposed LZO update currently lives in the linux-next tree.

I had tried several times during the last 12 months to provide an update of the kernel LZO version, but community interest seemed low and I basically got no feedback about performance improvements - which made me wonder if people actually care.

At least akpm did approve the LZO update for inclusion into 3.7, but the code still has not been merged into the main tree.

> On 2012-10-09 21:26, Andrew Morton wrote:
> [...]
> The changes look OK to me. Please ask Stephen to include the tree in
> linux-next, for a 3.7 merge.

Well, this probably means I have done rather poor marketing.

Anyway, as people seem to love *synthetic* benchmarks I'm finally posting some timings (including a brand new ARM unaligned version - this is just a quick hack which probably still can get optimized further).

Hopefully publishing these numbers will help arouse more interest. :-)

Cheers,
Markus

x86_64 (Sandy Bridge), gcc-4.6 -O3, Silesia test corpus, 256 kB block-size:

                 compression speed   decompression speed
  LZO-2005    :     150 MB/sec           468 MB/sec
  LZO-2012    :     434 MB/sec          1210 MB/sec

i386 (Sandy Bridge), gcc-4.6 -O3, Silesia test corpus, 256 kB block-size:

                 compression speed   decompression speed
  LZO-2005    :     143 MB/sec           409 MB/sec
  LZO-2012    :     372 MB/sec          1121 MB/sec

armv7 (Cortex-A9), Linaro gcc-4.6 -O3, Silesia test corpus, 256 kB block-size:

                 compression speed   decompression speed
  LZO-2005    :      27 MB/sec            84 MB/sec
  LZO-2012    :      44 MB/sec           117 MB/sec
  LZO-2013-UA :      47 MB/sec           167 MB/sec

Legend:

  LZO-2005    : LZO version in current 3.8 rc6 kernel (which is based on the LZO 2.02 release from 2005)
  LZO-2012    : updated LZO version available in linux-next
  LZO-2013-UA : updated LZO version available in linux-next plus ARM Unaligned Access patch (attached below)

> (Cc: added, I hope Markus still cares and someone could [...]

--
Markus Oberhumer, <markus@...>, http://www.oberhumer.com/
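
To put the decompression columns above on a common footing, the relative gains can be derived from the quoted throughputs. This is a rough sketch using only the table values; the dictionary and the `speedup_percent` helper are illustrative names, and only the armv7 and x86_64 rows are included to keep it short.

```python
# Decompression throughput in MB/sec, copied from the tables above.
DECOMPRESS_MB_S = {
    "armv7": {"LZO-2005": 84, "LZO-2012": 117, "LZO-2013-UA": 167},
    "x86_64": {"LZO-2005": 468, "LZO-2012": 1210},
}


def speedup_percent(old_mb_s: float, new_mb_s: float) -> float:
    """Relative throughput gain of the new version over the old one."""
    return (new_mb_s - old_mb_s) / old_mb_s * 100.0


if __name__ == "__main__":
    for arch, versions in DECOMPRESS_MB_S.items():
        names = list(versions)
        # Compare each version with the one listed immediately before it.
        for old, new in zip(names, names[1:]):
            gain = speedup_percent(versions[old], versions[new])
            print(f"{arch}: {old} -> {new}: +{gain:.1f}%")
```

On armv7 the unaligned-access patch alone accounts for a gain of about 43% over LZO-2012, and LZO-2013-UA is close to twice as fast as the 2005 version.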
|