
Re: [RFC PATCH 0/4] Add support for LZ4-compressed kernels

kyungsik.lee <kyungsik.lee@...>
 

On 2013-01-30 6:09 AM, Rajesh Pawar wrote:
Andrew Morton <akpm@...> wrote:

On Sat, 26 Jan 2013 14:50:43 +0900
Kyungsik Lee <kyungsik.lee@...> wrote:
This patchset is for supporting LZ4 compressed kernel and initial ramdisk on
the x86 and ARM architectures.

According to http://code.google.com/p/lz4/, LZ4 is a very fast lossless
compression algorithm and also features an extremely fast decoder.

Kernel Decompression APIs are based on implementation by Yann Collet
(http://code.google.com/p/lz4/source/checkout).
De/compression Tools are also provided from the site above.

The initial test result on ARM(v7) based board shows that the size of kernel
with LZ4 compressed is 8% bigger than LZO compressed but the decompressing
speed is faster(especially under the enabled unaligned memory access).

Test: 3.4 based kernel built with many modules
Uncompressed kernel size: 13MB
lzo: 6.3MB, 301ms
lz4: 6.8MB, 251ms(167ms, with enabled unaligned memory access)

It seems that it's worth trying LZ4 compressed kernel image or ramdisk
for making the kernel boot faster.

...

20 files changed, 663 insertions(+), 3 deletions(-)

...
What's this "with enabled unaligned memory access" thing? You mean "if
the arch supports CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS"? If so,
that's only x86, which isn't really in the target market for this
patch, yes?
It's a lot of code for a 50ms boot-time improvement. Does anyone have
any opinions on whether or not the benefits are worth the cost?
BTW, what happened to the proposed LZO update - wouldn't it be better to merge this first?

Also, under the hood LZ4 seems to be quite similar to LZO, so probably
LZO speed would also greatly benefit from unaligned access and some other
ARM optimisations
I didn't test with the proposed LZO update you mentioned. Sorry, which one do you mean?
I did some tests with the latest LZO in the mainline.

The result: LZO is not faster with unaligned access enabled on ARM. It is actually slower.

Decompression time: 336ms(383ms, with unaligned access enabled)

You may refer to https://lkml.org/lkml/2012/10/7/85 to know more about it.

Thanks,
Kyungsik


Re: [RFC PATCH 0/4] Add support for LZ4-compressed kernels

H. Peter Anvin <hpa@...>
 

On 01/31/2013 06:28 PM, Nicolas Pitre wrote:

Well, it is too nasty for public confession, but it's called
"paravirtualization".
The fact that you are aware of it means we're not going to break them.

But my point is that we must not be held back just in case someone out
there might have painted himself in a corner without telling anyone.
Yes. However, it makes it more questionable to simply rip out
compression methods without warning. Not that warnings help, as we have
learned.

-hpa


Re: [RFC PATCH 0/4] Add support for LZ4-compressed kernels

Nicolas Pitre <nico@...>
 

On Thu, 31 Jan 2013, H. Peter Anvin wrote:

On 01/31/2013 02:16 PM, Nicolas Pitre wrote:

Some utterly weird things like the Xen domain builder do that, because
they have to. That is why we explicitly document that the payload is
ELF and how to access it in the bzImage spec.
Are you kidding?

And what format do they expect?
I think they can be fairly flexible. Obviously gzip is always
supported. I don't know the details.

If people are doing weird things with formats we're about to remove then
it's their fault if they didn't make upstream developers aware of it.
And if the reason they didn't tell anyone is because it is too nasty for
public confession then they simply deserve to be broken and come up with
a more sustainable solution.
Well, it is too nasty for public confession, but it's called
"paravirtualization".
The fact that you are aware of it means we're not going to break them.

But my point is that we must not be held back just in case someone out
there might have painted himself in a corner without telling anyone.


Nicolas


Re: [RFC PATCH 0/4] Add support for LZ4-compressed kernels

H. Peter Anvin <hpa@...>
 

On 01/31/2013 02:16 PM, Nicolas Pitre wrote:

Some utterly weird things like the Xen domain builder do that, because
they have to. That is why we explicitly document that the payload is
ELF and how to access it in the bzImage spec.
Are you kidding?

And what format do they expect?
I think they can be fairly flexible. Obviously gzip is always
supported. I don't know the details.

If people are doing weird things with formats we're about to remove then
it's their fault if they didn't make upstream developers aware of it.
And if the reason they didn't tell anyone is because it is too nasty for
public confession then they simply deserve to be broken and come up with
a more sustainable solution.
Well, it is too nasty for public confession, but it's called
"paravirtualization".

-hpa


Re: [RFC PATCH 0/4] Add support for LZ4-compressed kernels

Nicolas Pitre <nico@...>
 

On Thu, 31 Jan 2013, H. Peter Anvin wrote:

On 01/30/2013 10:33 AM, Nicolas Pitre wrote:

The only concern I have with that is if someone paints themselves into a
corner and absolutely wants, say, LZO.
That would be hard to justify given that the kernel provides its own
decompressor code, making the compression format transparent to
bootloaders, etc. And no one should be poking into the compressed
zImage.
Some utterly weird things like the Xen domain builder do that, because
they have to. That is why we explicitly document that the payload is
ELF and how to access it in the bzImage spec.
Are you kidding?

And what format do they expect?

If people are doing weird things with formats we're about to remove then
it's their fault if they didn't make upstream developers aware of it.
And if the reason they didn't tell anyone is because it is too nasty for
public confession then they simply deserve to be broken and come up with
a more sustainable solution.


Nicolas


Re: [RFC PATCH 0/4] Add support for LZ4-compressed kernels

H. Peter Anvin <hpa@...>
 

On 01/30/2013 10:33 AM, Nicolas Pitre wrote:

The only concern I have with that is if someone paints themselves into a
corner and absolutely wants, say, LZO.
That would be hard to justify given that the kernel provides its own
decompressor code, making the compression format transparent to
bootloaders, etc. And no one should be poking into the compressed
zImage.
Some utterly weird things like the Xen domain builder do that, because
they have to. That is why we explicitly document that the payload is
ELF and how to access it in the bzImage spec.

-hpa


Re: [RFC PATCH 0/4] Add support for LZ4-compressed kernels

Nicolas Pitre <nico@...>
 

On Tue, 29 Jan 2013, H. Peter Anvin wrote:

On 01/29/2013 02:15 AM, Russell King - ARM Linux wrote:
On Mon, Jan 28, 2013 at 02:25:10PM -0800, Andrew Morton wrote:
What's this "with enabled unaligned memory access" thing? You mean "if
the arch supports CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS"? If so,
that's only x86, which isn't really in the target market for this
patch, yes?

It's a lot of code for a 50ms boot-time improvement. Does anyone have
any opinions on whether or not the benefits are worth the cost?
Well... when I saw this my immediate reaction was "oh no, yet another
decompressor for the kernel". We have five of these things already.
Do we really need a sixth?

My feeling is that we should have:
- one decompressor which is the fastest
- one decompressor for the highest compression ratio
- one popular decompressor (eg conventional gzip)

And if we have a replacement one for one of these, then it should do
exactly that: replace it. I realise that various architectures will
behave differently, so we should really be looking at numbers across
several arches.

Otherwise, where do we stop adding new ones? After we have 6 of these
(which is after this one). After 12? After the 20th?
The only concern I have with that is if someone paints themselves into a
corner and absolutely wants, say, LZO.
That would be hard to justify given that the kernel provides its own
decompressor code, making the compression format transparent to
bootloaders, etc. And no one should be poking into the compressed
zImage.

Otherwise, per your list it pretty much sounds like we should have lz4, gzip,
and xz.
I do agree with that.


Nicolas


Re: [RFC PATCH 0/4] Add support for LZ4-compressed kernels

Johannes Stezenbach <js@...>
 

On Mon, Jan 28, 2013 at 11:29:14PM -0500, Nicolas Pitre wrote:
On Mon, 28 Jan 2013, Andrew Morton wrote:

On Sat, 26 Jan 2013 14:50:43 +0900
Kyungsik Lee <kyungsik.lee@...> wrote:

This patchset is for supporting LZ4 compressed kernel and initial ramdisk on
the x86 and ARM architectures.

According to http://code.google.com/p/lz4/, LZ4 is a very fast lossless
compression algorithm and also features an extremely fast decoder.

Kernel Decompression APIs are based on implementation by Yann Collet
(http://code.google.com/p/lz4/source/checkout).
De/compression Tools are also provided from the site above.

The initial test result on ARM(v7) based board shows that the size of kernel
with LZ4 compressed is 8% bigger than LZO compressed but the decompressing
speed is faster(especially under the enabled unaligned memory access).

Test: 3.4 based kernel built with many modules
Uncompressed kernel size: 13MB
lzo: 6.3MB, 301ms
lz4: 6.8MB, 251ms(167ms, with enabled unaligned memory access)
What's this "with enabled unaligned memory access" thing? You mean "if
the arch supports CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS"? If so,
that's only x86, which isn't really in the target market for this
patch, yes?
I'm guessing this is referring to commit 5010192d5a.

It's a lot of code for a 50ms boot-time improvement. Does anyone have
any opinions on whether or not the benefits are worth the cost?
Well, we used to have only one compressed format. Now we have nearly
half a dozen, with the same worthiness issue between themselves.
Either we keep it very simple, or we make it very flexible. The former
would argue in favor of removing some of the existing formats, the latter
would let this new format in.
This reminded me to check the status of the lzo update and it
seems it got lost?
http://lkml.org/lkml/2012/10/3/144

(Cc: added, I hope Markus still cares and someone could
eventually take his patch once he resends it.)

Johannes


Re: [RFC PATCH 0/4] Add support for LZ4-compressed kernels

H. Peter Anvin <hpa@...>
 

On 01/29/2013 02:15 AM, Russell King - ARM Linux wrote:
On Mon, Jan 28, 2013 at 02:25:10PM -0800, Andrew Morton wrote:
What's this "with enabled unaligned memory access" thing? You mean "if
the arch supports CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS"? If so,
that's only x86, which isn't really in the target market for this
patch, yes?

It's a lot of code for a 50ms boot-time improvement. Does anyone have
any opinions on whether or not the benefits are worth the cost?
Well... when I saw this my immediate reaction was "oh no, yet another
decompressor for the kernel". We have five of these things already.
Do we really need a sixth?

My feeling is that we should have:
- one decompressor which is the fastest
- one decompressor for the highest compression ratio
- one popular decompressor (eg conventional gzip)

And if we have a replacement one for one of these, then it should do
exactly that: replace it. I realise that various architectures will
behave differently, so we should really be looking at numbers across
several arches.

Otherwise, where do we stop adding new ones? After we have 6 of these
(which is after this one). After 12? After the 20th?
The only concern I have with that is if someone paints themselves into a corner and absolutely wants, say, LZO.

Otherwise, per your list it pretty much sounds like we should have lz4, gzip, and xz.

-hpa

--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel. I don't speak on their behalf.


Re: [RFC PATCH 0/4] Add support for LZ4-compressed kernels

Russell King - ARM Linux <linux@...>
 

On Tue, Jan 29, 2013 at 12:43:20PM +0100, Egon Alter wrote:
Am Dienstag, 29. Januar 2013, 10:15:49 schrieb Russell King - ARM Linux:
On Mon, Jan 28, 2013 at 02:25:10PM -0800, Andrew Morton wrote:
What's this "with enabled unaligned memory access" thing? You mean "if
the arch supports CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS"? If so,
that's only x86, which isn't really in the target market for this
patch, yes?

It's a lot of code for a 50ms boot-time improvement. Does anyone have
any opinions on whether or not the benefits are worth the cost?
Well... when I saw this my immediate reaction was "oh no, yet another
decompressor for the kernel". We have five of these things already.
Do we really need a sixth?

My feeling is that we should have:
- one decompressor which is the fastest
- one decompressor for the highest compression ratio
- one popular decompressor (eg conventional gzip)
the problem gets more complicated as the "fastest" decompressor usually
creates larger images which need more time to load from the storage, e.g. a
one MB larger image on a 10 MB/s storage (note: bootloaders often configure
the storage controllers in slow modes) gives 100 ms more boot time, thus
eating the gain of a "fast decompressor".
Ok.

We already have:

- lzma: 33% smaller than gzip, decompression speed between gzip and bzip2
- xz: 30% smaller than gzip, decompression speed similar to lzma
- bzip2: 10% smaller than gzip, slowest decompression
- gzip: reference implementation
- lzo: 10% bigger than gzip, fastest

And now:

- lz4: 8% bigger than lzo, 16% faster than lzo?
(I make that 19% bigger than gzip)

So, image size wise, on a 2MB compressed gzip image, we're looking at
the difference between LZO at 2.2MB and LZ4 at 2.38MB.

But let's not stop there - the figures given for a 13MB decompressed
image were:

lzo: 6.3MB, 301ms
lz4: 6.8MB, 251ms(167ms, with enabled unaligned memory access)

At 10MB/s (your figure), it takes .68s to read 6.8MB as opposed to .63s
for LZO. So, totalling up these figures gives the overall figure:

lzo: 301ms + 630ms = 931ms
lz4: 167ms + 680ms = 797ms

Which gives the tradeoff at 10MB/s of 14% faster (but only with efficient
unaligned memory access.) So... this faster decompressor is still the
fastest even with your media transfer rate factored in.

That gives an argument for replacing lzo with lz4...


Re: [RFC PATCH 0/4] Add support for LZ4-compressed kernels

Egon Alter <egon.alter@...>
 

Am Dienstag, 29. Januar 2013, 10:15:49 schrieb Russell King - ARM Linux:
On Mon, Jan 28, 2013 at 02:25:10PM -0800, Andrew Morton wrote:
What's this "with enabled unaligned memory access" thing? You mean "if
the arch supports CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS"? If so,
that's only x86, which isn't really in the target market for this
patch, yes?

It's a lot of code for a 50ms boot-time improvement. Does anyone have
any opinions on whether or not the benefits are worth the cost?
Well... when I saw this my immediate reaction was "oh no, yet another
decompressor for the kernel". We have five of these things already.
Do we really need a sixth?

My feeling is that we should have:
- one decompressor which is the fastest
- one decompressor for the highest compression ratio
- one popular decompressor (eg conventional gzip)
the problem gets more complicated as the "fastest" decompressor usually
creates larger images which need more time to load from the storage, e.g. a
one MB larger image on a 10 MB/s storage (note: bootloaders often configure
the storage controllers in slow modes) gives 100 ms more boot time, thus
eating the gain of a "fast decompressor".

Egon
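
The trade-off described in this message can be expressed as a break-even storage rate: the read rate at which the extra load time of a larger image exactly cancels the decompression time it saves. The C sketch below is illustrative only. The function name is hypothetical, and it assumes a constant read rate with no other costs.

```c
#include <assert.h>

/*
 * Storage read rate (MB/s) at which a larger but faster-to-decompress
 * image costs exactly as much extra load time as it saves in
 * decompression time. Below this rate the smaller image boots sooner;
 * above it the faster decompressor wins.
 */
double break_even_rate_mb_per_s(double extra_size_mb,
                                double saved_decompress_ms)
{
    assert(extra_size_mb > 0.0 && saved_decompress_ms > 0.0);
    return extra_size_mb / (saved_decompress_ms / 1000.0);
}
```

With the figures posted in this thread (6.8MB vs 6.3MB, so 0.5MB extra), the 50ms saving without unaligned access breaks even at about 10 MB/s, and the 134ms saving with unaligned access breaks even at about 3.7 MB/s.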


Re: [RFC PATCH 0/4] Add support for LZ4-compressed kernels

Russell King - ARM Linux <linux@...>
 

On Mon, Jan 28, 2013 at 02:25:10PM -0800, Andrew Morton wrote:
What's this "with enabled unaligned memory access" thing? You mean "if
the arch supports CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS"? If so,
that's only x86, which isn't really in the target market for this
patch, yes?

It's a lot of code for a 50ms boot-time improvement. Does anyone have
any opinions on whether or not the benefits are worth the cost?
Well... when I saw this my immediate reaction was "oh no, yet another
decompressor for the kernel". We have five of these things already.
Do we really need a sixth?

My feeling is that we should have:
- one decompressor which is the fastest
- one decompressor for the highest compression ratio
- one popular decompressor (eg conventional gzip)

And if we have a replacement one for one of these, then it should do
exactly that: replace it. I realise that various architectures will
behave differently, so we should really be looking at numbers across
several arches.

Otherwise, where do we stop adding new ones? After we have 6 of these
(which is after this one). After 12? After the 20th?


Re: [RFC PATCH 0/4] Add support for LZ4-compressed kernels

Richard Cochran <richardcochran@...>
 

On Mon, Jan 28, 2013 at 02:25:10PM -0800, Andrew Morton wrote:

It's a lot of code for a 50ms boot-time improvement. Does anyone have
any opinions on whether or not the benefits are worth the cost?
In the embedded space, quick boot is a really important feature to
have. Many people resort to awful hacks in order to improve boot time,
and so I would welcome this option.

I have seen arm systems that boot in 300 ms. I would say that 50 ms is
maybe not such a small improvement after all.

Thanks,
Richard


Re: [RFC PATCH 0/4] Add support for LZ4-compressed kernels

H. Peter Anvin <hpa@...>
 

Uhm... you're saying we have to be at one extreme or the other?

We probably could drop the legacy lzma format, but someone might rely on it.

Nicolas Pitre <nico@...> wrote:

On Mon, 28 Jan 2013, Andrew Morton wrote:

On Sat, 26 Jan 2013 14:50:43 +0900
Kyungsik Lee <kyungsik.lee@...> wrote:

This patchset is for supporting LZ4 compressed kernel and initial
ramdisk on
the x86 and ARM architectures.

According to http://code.google.com/p/lz4/, LZ4 is a very fast
lossless
compression algorithm and also features an extremely fast decoder.

Kernel Decompression APIs are based on implementation by Yann
Collet
(http://code.google.com/p/lz4/source/checkout).
De/compression Tools are also provided from the site above.

The initial test result on ARM(v7) based board shows that the size
of kernel
with LZ4 compressed is 8% bigger than LZO compressed but the
decompressing
speed is faster(especially under the enabled unaligned memory
access).

Test: 3.4 based kernel built with many modules
Uncompressed kernel size: 13MB
lzo: 6.3MB, 301ms
lz4: 6.8MB, 251ms(167ms, with enabled unaligned memory access)

It seems that it's worth trying LZ4 compressed kernel image or
ramdisk
for making the kernel boot faster.

...

20 files changed, 663 insertions(+), 3 deletions(-)

...
What's this "with enabled unaligned memory access" thing? You mean
"if
the arch supports CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS"? If so,
that's only x86, which isn't really in the target market for this
patch, yes?
I'm guessing this is referring to commit 5010192d5a.

It's a lot of code for a 50ms boot-time improvement. Does anyone
have
any opinions on whether or not the benefits are worth the cost?
Well, we used to have only one compressed format. Now we have nearly
half a dozen, with the same worthiness issue between themselves.
Either we keep it very simple, or we make it very flexible. The former

would argue in favor of removing some of the existing formats, the
latter
would let this new format in.


Nicolas
--
Sent from my mobile phone. Please excuse brevity and lack of formatting.


Re: [RFC PATCH 0/4] Add support for LZ4-compressed kernels

Nicolas Pitre <nico@...>
 

On Mon, 28 Jan 2013, Andrew Morton wrote:

On Sat, 26 Jan 2013 14:50:43 +0900
Kyungsik Lee <kyungsik.lee@...> wrote:

This patchset is for supporting LZ4 compressed kernel and initial ramdisk on
the x86 and ARM architectures.

According to http://code.google.com/p/lz4/, LZ4 is a very fast lossless
compression algorithm and also features an extremely fast decoder.

Kernel Decompression APIs are based on implementation by Yann Collet
(http://code.google.com/p/lz4/source/checkout).
De/compression Tools are also provided from the site above.

The initial test result on ARM(v7) based board shows that the size of kernel
with LZ4 compressed is 8% bigger than LZO compressed but the decompressing
speed is faster(especially under the enabled unaligned memory access).

Test: 3.4 based kernel built with many modules
Uncompressed kernel size: 13MB
lzo: 6.3MB, 301ms
lz4: 6.8MB, 251ms(167ms, with enabled unaligned memory access)

It seems that it's worth trying LZ4 compressed kernel image or ramdisk
for making the kernel boot faster.

...

20 files changed, 663 insertions(+), 3 deletions(-)

...
What's this "with enabled unaligned memory access" thing? You mean "if
the arch supports CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS"? If so,
that's only x86, which isn't really in the target market for this
patch, yes?
I'm guessing this is referring to commit 5010192d5a.

It's a lot of code for a 50ms boot-time improvement. Does anyone have
any opinions on whether or not the benefits are worth the cost?
Well, we used to have only one compressed format. Now we have nearly
half a dozen, with the same worthiness issue between themselves.
Either we keep it very simple, or we make it very flexible. The former
would argue in favor of removing some of the existing formats, the latter
would let this new format in.


Nicolas


Re: [RFC PATCH 0/4] Add support for LZ4-compressed kernels

kyungsik.lee <kyungsik.lee@...>
 

On 2013-01-29 7:25 AM, Andrew Morton wrote:
On Sat, 26 Jan 2013 14:50:43 +0900
Kyungsik Lee <kyungsik.lee@...> wrote:

This patchset is for supporting LZ4 compressed kernel and initial ramdisk on
the x86 and ARM architectures.

According to http://code.google.com/p/lz4/, LZ4 is a very fast lossless
compression algorithm and also features an extremely fast decoder.

Kernel Decompression APIs are based on implementation by Yann Collet
(http://code.google.com/p/lz4/source/checkout).
De/compression Tools are also provided from the site above.

The initial test result on ARM(v7) based board shows that the size of kernel
with LZ4 compressed is 8% bigger than LZO compressed but the decompressing
speed is faster(especially under the enabled unaligned memory access).

Test: 3.4 based kernel built with many modules
Uncompressed kernel size: 13MB
lzo: 6.3MB, 301ms
lz4: 6.8MB, 251ms(167ms, with enabled unaligned memory access)

It seems that it's worth trying LZ4 compressed kernel image or ramdisk
for making the kernel boot faster.

...

20 files changed, 663 insertions(+), 3 deletions(-)

...
What's this "with enabled unaligned memory access" thing? You mean "if
the arch supports CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS"? If so,
that's only x86, which isn't really in the target market for this
patch, yes?
Yes, exactly. If the arch supports CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS,
then a larger boot-time improvement is expected from the LZ4 decompressor.

Currently two architectures in mainline support it: x86 and powerpc.
ARM (v6 or above) is also expected to support it since the commit below.
Commit ID: 5010192d5
ARM: 7583/1: decompressor: Enable unaligned memory access for v6 and above
by Dave Martin

The test results(167ms) come from the ARM(v7 arch), MSM8960 based board with
CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS set.
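
As a rough illustration of why efficient unaligned access can matter to a decompressor's copy loops, the C sketch below contrasts a byte-at-a-time copy with a 4-bytes-at-a-time copy. It is not code from the patchset: the function names are hypothetical, the buffers are assumed not to overlap, and whether the word copy compiles to a single load and store depends on the compiler and target.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Byte-at-a-time copy: works for any alignment, one load/store per byte. */
void copy_bytewise(unsigned char *dst, const unsigned char *src, size_t n)
{
    assert(dst != NULL && src != NULL);
    while (n--)
        *dst++ = *src++;
}

/*
 * Word-at-a-time copy: moves 4 bytes per step, then finishes the tail
 * bytewise. The fixed-size memcpy() calls keep this well-defined C for
 * any alignment; on targets with cheap unaligned access a compiler can
 * typically lower each one to a single load or store.
 * Assumption: src and dst do not overlap.
 */
void copy_wordwise(unsigned char *dst, const unsigned char *src, size_t n)
{
    assert(dst != NULL && src != NULL);
    while (n >= 4) {
        uint32_t w;

        memcpy(&w, src, sizeof(w));
        memcpy(dst, &w, sizeof(w));
        src += 4;
        dst += 4;
        n -= 4;
    }
    while (n--)
        *dst++ = *src++;
}
```

Both functions produce the same bytes. The difference is only in how many memory operations are issued, which is where a target without efficient unaligned access would lose the advantage.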



It's a lot of code for a 50ms boot-time improvement. Does anyone have
any opinions on whether or not the benefits are worth the cost?
Not only the kernel but also the ramdisk can be compressed with LZ4, so
the boot time would improve further. The test case above didn't include
the decompression time for an LZ4-compressed ramdisk.

So far the implementation applies to boot-time improvement for
LZ4-compressed kernel and ramdisk images, but the decompressor module is
exported as an interface for other uses, like LZO.
Once an LZ4 compressor is available (it is not yet implemented for the kernel),
it is expected to be used in many places in the kernel, such as crypto and
fs (squashfs, btrfs).

Thanks,
Kyungsik


Re: [RFC PATCH 0/4] Add support for LZ4-compressed kernels

Andrew Morton
 

On Sat, 26 Jan 2013 14:50:43 +0900
Kyungsik Lee <kyungsik.lee@...> wrote:

This patchset is for supporting LZ4 compressed kernel and initial ramdisk on
the x86 and ARM architectures.

According to http://code.google.com/p/lz4/, LZ4 is a very fast lossless
compression algorithm and also features an extremely fast decoder.

Kernel Decompression APIs are based on implementation by Yann Collet
(http://code.google.com/p/lz4/source/checkout).
De/compression Tools are also provided from the site above.

The initial test result on ARM(v7) based board shows that the size of kernel
with LZ4 compressed is 8% bigger than LZO compressed but the decompressing
speed is faster(especially under the enabled unaligned memory access).

Test: 3.4 based kernel built with many modules
Uncompressed kernel size: 13MB
lzo: 6.3MB, 301ms
lz4: 6.8MB, 251ms(167ms, with enabled unaligned memory access)

It seems that it's worth trying LZ4 compressed kernel image or ramdisk
for making the kernel boot faster.

...

20 files changed, 663 insertions(+), 3 deletions(-)

...
What's this "with enabled unaligned memory access" thing? You mean "if
the arch supports CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS"? If so,
that's only x86, which isn't really in the target market for this
patch, yes?

It's a lot of code for a 50ms boot-time improvement. Does anyone have
any opinions on whether or not the benefits are worth the cost?


ELC schedule available - technical showcase proposals wanted

Tim Bird <tim.bird@...>
 

Hi everyone,

The schedule for Embedded Linux Conference is now available at:
http://events.linuxfoundation.org/events/embedded-linux-conference/schedule

ELC is February 20-22 in San Francisco. You'll find a variety of
interesting and useful topics on the schedule. I hope you can attend.

If you're coming, and would like to have a table and poster at our
technical showcase, please send me an e-mail.

The technical showcase is an informal "poster/demo session", with
an evening reception, planned for Wednesday, February 20 at ELC.

Full details about the technical showcase are here:
http://elinux.org/ELC_2013_Technical_Showcase

I'm looking forward to another great event, and hope to see
you there!
-- Tim

=============================
Tim Bird
Architecture Group Chair, CE Workgroup of the Linux Foundation
Senior Staff Engineer, Sony Network Entertainment
=============================


Videos of the Embedded Linux Conference Europe 2012

Michael Opdenacker
 

Greetings,

Here are the videos of the latest edition of the Embedded Linux
Conference Europe (ELC-E):
http://free-electrons.com/blog/elce-2012-videos/

Make sure you don't miss the upcoming ELC next month. The early bird
rate for ELC expires soon by the way.

You can download videos for the topics you're most interested in and
view them during your flight to ELC ;)

Thanks to Thomas Petazzoni who took care of most of the video encoding
and publishing work!

Cheers,

Michael and

--
Michael Opdenacker, Free Electrons
Kernel, drivers, real-time and embedded Linux
development, consulting, training and support.
http://free-electrons.com
+33 484 258 098


[PATCH v3] Sample misc loop driver (was LDT - Linux Driver Template)

Constantine Shulyupin
 

From: Constantine Shulyupin <const@...>

misc_loop_drv - a simple sample driver with a validation script.
It implements a loopback device with blocking and
nonblocking read and write file interfaces.
It is useful for Linux driver development beginners
and hackers. It can be used as a starting point for new char
device drivers.
The driver uses the following Linux facilities:
misc device, file operations (read/write, blocking and
nonblocking mode, polling), kfifo, interrupt, I/O, tasklet.

Signed-off-by: Constantine Shulyupin <const@...>

---

Changelog

Since v2

- significantly simplified, removed mmap, ioctl, timer, kthread, Device Model,
debugfs, ftrace, UART
- renamed from ldt to misc_loop_drv
- utilised pr_fmt, kernel-doc

Since v1
- removed tracing macros and replaced with pr_debug
- removed multiple char devices (alloc_chrdev_region, class_create) and used only misc device
- added module parameters descriptions
- fixed braces
- wrapped long lines
- fixed declaration and initialization of some variables
- improved comments
- reworked some functions
- added blocking write
- edited log messages
- added more error checking
- added tabulation of structs
- removed standalone module_init and module_exit in favour of module_platform_driver
- added driver data struct
---
samples/Kconfig | 2 +
samples/Makefile | 5 +-
samples/misc_loop_drv/Kconfig | 22 ++
samples/misc_loop_drv/Makefile | 3 +
samples/misc_loop_drv/misc_loop_drv.c | 344 ++++++++++++++++++++++++++++++
samples/misc_loop_drv/misc_loop_drv_test | 46 ++++
6 files changed, 420 insertions(+), 2 deletions(-)
create mode 100644 samples/misc_loop_drv/Kconfig
create mode 100644 samples/misc_loop_drv/Makefile
create mode 100644 samples/misc_loop_drv/misc_loop_drv.c
create mode 100755 samples/misc_loop_drv/misc_loop_drv_test

diff --git a/samples/Kconfig b/samples/Kconfig
index 7b6792a..479ff41 100644
--- a/samples/Kconfig
+++ b/samples/Kconfig
@@ -69,4 +69,6 @@ config SAMPLE_RPMSG_CLIENT
to communicate with an AMP-configured remote processor over
the rpmsg bus.

+source "samples/misc_loop_drv/Kconfig"
+
endif # SAMPLES
diff --git a/samples/Makefile b/samples/Makefile
index 5ef08bb..6a9f374 100644
--- a/samples/Makefile
+++ b/samples/Makefile
@@ -1,4 +1,5 @@
# Makefile for Linux samples code

-obj-$(CONFIG_SAMPLES) += kobject/ kprobes/ tracepoints/ trace_events/ \
- hw_breakpoint/ kfifo/ kdb/ hidraw/ rpmsg/ seccomp/
+obj-$(CONFIG_SAMPLES) += kobject/ kprobes/ tracepoints/ trace_events/
+obj-$(CONFIG_SAMPLES) += hw_breakpoint/ kfifo/ kdb/ hidraw/ rpmsg/ seccomp/
+obj-$(CONFIG_SAMPLES) += misc_loop_drv/
diff --git a/samples/misc_loop_drv/Kconfig b/samples/misc_loop_drv/Kconfig
new file mode 100644
index 0000000..17efa14
--- /dev/null
+++ b/samples/misc_loop_drv/Kconfig
@@ -0,0 +1,22 @@
+config SAMPLE_MISC_LOOP_DRIVER
+ tristate "Build misc_loop_drv - Simple misc loop device driver sample with test script"
+ depends on m
+ help
+ It's a simple sample driver with a validation script.
+ It implements a loopback device with blocking and
+ nonblocking read and write file interfaces.
+ It is useful for Linux driver development beginners
+ and hackers. It can be used as a starting point for new char
+ device drivers.
+ The driver uses the following Linux facilities:
+ misc device, file operations (read/write, blocking and
+ nonblocking mode, polling), kfifo, interrupt, I/O, tasklet.
+ You can run the driver with the test script misc_loop_drv_test.
+
+ List of other samples and skeletons can be found at http://elinux.org/Device_drivers
+
+config SAMPLE_MISC_LOOP_DRIVER_DEBUG
+ bool "misc_loop_drv debugging messages"
+ depends on SAMPLE_MISC_LOOP_DRIVER
+ help
+ Enable debug messages for misc_loop_drv.
diff --git a/samples/misc_loop_drv/Makefile b/samples/misc_loop_drv/Makefile
new file mode 100644
index 0000000..87d7878
--- /dev/null
+++ b/samples/misc_loop_drv/Makefile
@@ -0,0 +1,3 @@
+ccflags-y += -D DEBUG
+obj-$(CONFIG_SAMPLE_MISC_LOOP_DRIVER)+= misc_loop_drv.o
+ccflags-$(CONFIG_SAMPLE_MISC_LOOP_DRIVER_DEBUG) := -DDEBUG
diff --git a/samples/misc_loop_drv/misc_loop_drv.c b/samples/misc_loop_drv/misc_loop_drv.c
new file mode 100644
index 0000000..0fa9a97
--- /dev/null
+++ b/samples/misc_loop_drv/misc_loop_drv.c
@@ -0,0 +1,344 @@
+/*
+ * misc_loop_drv
+ *
+ * Simple misc device driver sample
+ *
+ * Copyright (C) 2012 Constantine Shulyupin http://www.makelinux.net/
+ *
+ * Licensed under the GPLv2.
+ *
+ * The driver demonstrates usage of following Linux facilities:
+ *
+ * Linux kernel module
+ * simple single misc device file (miscdevice, misc_register)
+ * file_operations
+ * read and write
+ * blocking read and write
+ * polling
+ * kfifo
+ * interrupt
+ * io
+ * tasklet
+ *
+ * Run test script misc_loop_drv_test to test the driver
+ *
+ */
+
+#include <linux/io.h>
+#include <linux/ioport.h>
+#include <linux/mm.h>
+#include <linux/interrupt.h>
+#include <linux/sched.h>
+#include <linux/delay.h>
+#include <linux/kfifo.h>
+#include <linux/fs.h>
+#include <linux/poll.h>
+#include <linux/module.h>
+#include <linux/slab.h>
+#include <linux/miscdevice.h>
+#include <linux/serial_reg.h>
+#include <linux/cdev.h>
+#include <asm/apic.h>
+
+#undef pr_fmt
+#define pr_fmt(fmt) "%s.c:%d %s " fmt, KBUILD_MODNAME, __LINE__, __func__
+
+/*
+ * It is assumed that your computer doesn't use a floppy device.
+ * This driver uses the IRQ and port of the floppy device for demonstration.
+ */
+
+static int port = 0x3f4;
+module_param(port, int, 0);
+MODULE_PARM_DESC(port, "I/O port number, default 0x3f4 - floppy");
+
+static int port_size = 2;
+module_param(port_size, int, 0);
+MODULE_PARM_DESC(port_size, "size of I/O port mapping, default 2");
+
+static int irq = 6;
+module_param(irq, int, 0);
+MODULE_PARM_DESC(irq, "interrupt request number, default 6 - floppy");
+
+/*
+ * Offsets of registers in port_emulation
+ *
+ * Pay attention that MISC_DRV_TX == MISC_DRV_RX and MISC_DRV_TX_FULL == MISC_DRV_RX_READY.
+ * Transmitted data becomes received and emulates port in loopback mode.
+ */
+
+#define MISC_DRV_TX 0
+#define MISC_DRV_RX 0
+#define MISC_DRV_TX_FULL 1
+#define MISC_DRV_RX_READY 1
+
+static char port_emulation[2]; /* array to emulate I/O port */
+
+#define FIFO_SIZE 128 /* must be power of two */
+
+/**
+ * struct misc_loop_drv_data - the driver data
+ * @in_fifo: input queue for write
+ * @out_fifo: output queue for read
+ * @fifo_lock: lock for queues
+ * @readable: waitqueue for blocking read
+ * @writeable: waitqueue for blocking write
+ * @port_ptr: mapped io port
+ *
+ * A single instance is allocated by misc_loop_drv_data_init() and
+ * stored in the file-scope pointer drvdata.
+ */
+
+struct misc_loop_drv_data {
+ struct mutex read_lock;
+ struct mutex write_lock;
+ DECLARE_KFIFO(in_fifo, char, FIFO_SIZE);
+ DECLARE_KFIFO(out_fifo, char, FIFO_SIZE);
+ spinlock_t fifo_lock;
+ wait_queue_head_t readable, writeable;
+ struct tasklet_struct misc_loop_drv_tasklet;
+ void __iomem *port_ptr;
+ struct resource *port_res;
+};
+
+static struct misc_loop_drv_data *drvdata;
+
+static void misc_loop_drv_tasklet_func(unsigned long d)
+{
+ char data_out, data_in;
+ struct misc_loop_drv_data *drvdata = (void *)d;
+
+ while (!ioread8(drvdata->port_ptr + MISC_DRV_TX_FULL)
+ && kfifo_out_spinlocked(&drvdata->out_fifo,
+ &data_out, sizeof(data_out), &drvdata->fifo_lock)) {
+ wake_up_interruptible(&drvdata->writeable);
+ pr_debug("data_out=%d %c\n", data_out, data_out >= 32 ? data_out : ' ');
+ iowrite8(data_out, drvdata->port_ptr + MISC_DRV_TX);
+ /* set full tx flag and implicitly rx ready flag */
+ iowrite8(1, drvdata->port_ptr + MISC_DRV_TX_FULL);
+ /*
+ * In regular drivers the hardware raises interrupts.
+ * Because this driver works without real hardware,
+ * we simulate an interrupt with the function send_IPI_all.
+ * A driver that works with real hardware does not need this.
+ */
+ apic->send_IPI_all(IRQ0_VECTOR+irq);
+ }
+ while (ioread8(drvdata->port_ptr + MISC_DRV_RX_READY)) {
+ data_in = ioread8(drvdata->port_ptr + MISC_DRV_RX);
+ pr_debug("data_in=%d %c\n", data_in, data_in >= 32 ? data_in : ' ');
+ kfifo_in_spinlocked(&drvdata->in_fifo, &data_in,
+ sizeof(data_in), &drvdata->fifo_lock);
+ wake_up_interruptible(&drvdata->readable);
+ /* clear rx ready flag and implicitly tx full flag */
+ iowrite8(0, drvdata->port_ptr + MISC_DRV_RX_READY);
+ }
+}
+
+static irqreturn_t misc_loop_drv_isr(int irq, void *d)
+{
+ struct misc_loop_drv_data *drvdata = (void *)d;
+
+ tasklet_schedule(&drvdata->misc_loop_drv_tasklet);
+ return IRQ_HANDLED;
+}
+
+static int misc_loop_drv_open(struct inode *inode, struct file *file)
+{
+ pr_debug("from %s\n", current->comm);
+ /* client related data can be allocated here and
+ stored in file->private_data */
+ return 0;
+}
+
+static int misc_loop_drv_release(struct inode *inode, struct file *file)
+{
+ pr_debug("from %s\n", current->comm);
+ /* client related data can be retrieved from file->private_data
+ and released here */
+ return 0;
+}
+
+static ssize_t misc_loop_drv_read(struct file *file, char __user *buf,
+ size_t count, loff_t *ppos)
+{
+ int ret = 0;
+ unsigned int copied;
+
+ pr_debug("from %s\n", current->comm);
+ if (kfifo_is_empty(&drvdata->in_fifo)) {
+ if (file->f_flags & O_NONBLOCK) {
+ return -EAGAIN;
+ } else {
+ pr_debug("%s\n", "waiting");
+ ret = wait_event_interruptible(drvdata->readable,
+ !kfifo_is_empty(&drvdata->in_fifo));
+ if (ret == -ERESTARTSYS) {
+ pr_err("interrupted\n");
+ return -EINTR;
+ }
+ }
+ }
+ if (mutex_lock_interruptible(&drvdata->read_lock))
+ return -EINTR;
+ ret = kfifo_to_user(&drvdata->in_fifo, buf, count, &copied);
+ mutex_unlock(&drvdata->read_lock);
+
+ return ret ? ret : copied;
+}
+
+static ssize_t misc_loop_drv_write(struct file *file, const char __user *buf,
+ size_t count, loff_t *ppos)
+{
+ int ret;
+ unsigned int copied;
+
+ pr_debug("from %s\n", current->comm);
+ if (kfifo_is_full(&drvdata->out_fifo)) {
+ if (file->f_flags & O_NONBLOCK) {
+ return -EAGAIN;
+ } else {
+ ret = wait_event_interruptible(drvdata->writeable,
+ !kfifo_is_full(&drvdata->out_fifo));
+ if (ret == -ERESTARTSYS) {
+ pr_err("interrupted\n");
+ return -EINTR;
+ }
+ }
+ }
+ if (mutex_lock_interruptible(&drvdata->write_lock))
+ return -EINTR;
+ ret = kfifo_from_user(&drvdata->out_fifo, buf, count, &copied);
+ mutex_unlock(&drvdata->write_lock);
+ tasklet_schedule(&drvdata->misc_loop_drv_tasklet);
+
+ return ret ? ret : copied;
+}
+
+static unsigned int misc_loop_drv_poll(struct file *file, poll_table *pt)
+{
+ unsigned int mask = 0;
+
+ poll_wait(file, &drvdata->readable, pt);
+ poll_wait(file, &drvdata->writeable, pt);
+
+ if (!kfifo_is_empty(&drvdata->in_fifo))
+ mask |= POLLIN | POLLRDNORM;
+ if (!kfifo_is_full(&drvdata->out_fifo))
+ mask |= POLLOUT | POLLWRNORM;
+ /*
+ * In case of output end of file, also set mask |= POLLHUP;
+ * in case of output error, also set mask |= POLLERR;
+ */
+ return mask;
+}
+
+static const struct file_operations misc_loop_drv_fops = {
+ .owner = THIS_MODULE,
+ .open = misc_loop_drv_open,
+ .release = misc_loop_drv_release,
+ .read = misc_loop_drv_read,
+ .write = misc_loop_drv_write,
+ .poll = misc_loop_drv_poll,
+};
+
+static struct miscdevice misc_loop_drv_dev = {
+ .minor = MISC_DYNAMIC_MINOR,
+ .name = KBUILD_MODNAME,
+ .fops = &misc_loop_drv_fops,
+};
+
+/*
+ * Initialization and cleanup section
+ */
+
+static void misc_loop_drv_cleanup(void)
+{
+ if (misc_loop_drv_dev.this_device)
+ misc_deregister(&misc_loop_drv_dev);
+ if (irq)
+ free_irq(irq, drvdata);
+ tasklet_kill(&drvdata->misc_loop_drv_tasklet);
+
+ if (drvdata->port_ptr)
+ ioport_unmap(drvdata->port_ptr);
+ if (drvdata->port_res)
+ release_region(port, port_size);
+ kfree(drvdata);
+}
+
+static struct misc_loop_drv_data *misc_loop_drv_data_init(void)
+{
+ struct misc_loop_drv_data *drvdata;
+
+ drvdata = kzalloc(sizeof(*drvdata), GFP_KERNEL);
+ if (!drvdata)
+ return NULL;
+ init_waitqueue_head(&drvdata->readable);
+ init_waitqueue_head(&drvdata->writeable);
+ INIT_KFIFO(drvdata->in_fifo);
+ INIT_KFIFO(drvdata->out_fifo);
+ mutex_init(&drvdata->read_lock);
+ mutex_init(&drvdata->write_lock);
+ tasklet_init(&drvdata->misc_loop_drv_tasklet,
+ misc_loop_drv_tasklet_func, (unsigned long)drvdata);
+ return drvdata;
+}
+
+static int __init misc_loop_drv_init(void)
+{
+ int ret;
+
+ pr_debug("MODNAME=%s\n", KBUILD_MODNAME);
+ pr_debug("port = %X irq = %d\n", port, irq);
+
+ drvdata = misc_loop_drv_data_init();
+ if (!drvdata) {
+ pr_err("misc_loop_drv_data_init failed\n");
+ return -ENOMEM;
+ }
+
+ drvdata->port_res = request_region(port, port_size, KBUILD_MODNAME);
+ if (!drvdata->port_res) {
+ pr_err("request_region failed\n");
+ ret = -EBUSY;
+ goto err_free;
+ }
+ /*
+ * A real I/O port should be mapped with ioport_map():
+ * drvdata->port_ptr = ioport_map(port, port_size);
+ * Because we work in emulation mode, we use an array instead of mapped ports.
+ */
+ drvdata->port_ptr = (void __iomem *)port_emulation;
+ /* clear ports */
+ iowrite8(0, drvdata->port_ptr + MISC_DRV_TX);
+ iowrite8(0, drvdata->port_ptr + MISC_DRV_TX_FULL);
+ ret = misc_register(&misc_loop_drv_dev);
+ if (ret < 0) {
+ pr_err("misc_register failed\n");
+ goto err_release;
+ }
+ pr_debug("misc_loop_drv_dev.minor=%d\n", misc_loop_drv_dev.minor);
+ ret = request_irq(irq, misc_loop_drv_isr, 0, KBUILD_MODNAME, drvdata);
+ if (ret < 0) {
+ pr_err("request_irq failed\n");
+ goto err_deregister;
+ }
+ return 0;
+
+err_deregister:
+ misc_deregister(&misc_loop_drv_dev);
+ tasklet_kill(&drvdata->misc_loop_drv_tasklet);
+err_release:
+ release_region(port, port_size);
+err_free:
+ kfree(drvdata);
+ return ret;
+}
+
+module_init(misc_loop_drv_init);
+module_exit(misc_loop_drv_cleanup);
+
+MODULE_DESCRIPTION("misc_loop_drv Simple misc device driver sample");
+MODULE_AUTHOR("Constantine Shulyupin <const@...>");
+MODULE_LICENSE("GPL");
diff --git a/samples/misc_loop_drv/misc_loop_drv_test b/samples/misc_loop_drv/misc_loop_drv_test
new file mode 100755
index 0000000..4f69512
--- /dev/null
+++ b/samples/misc_loop_drv/misc_loop_drv_test
@@ -0,0 +1,46 @@
+#!/bin/bash
+
+# The script is intended to be run from user root
+
+RED="\\033[0;31m"
+NOCOLOR="\\033[0;39m"
+GREEN="\\033[0;32m"
+
+dmesg -c > /dev/null # clear log buffer
+# It is assumed that your computer doesn't use a floppy device.
+# misc_loop_drv uses IRQ 6 of the floppy device to demonstrate IRQ usage.
+rmmod floppy 2> /dev/null
+rmmod misc_loop_drv 2> /dev/null
+insmod ./misc_loop_drv.ko
+
+data=123rw
+cat /dev/misc_loop_drv > R.tmp &
+sleep 0.1; echo $data > /dev/misc_loop_drv;
+sleep 0.5
+kill %1; wait %1 2> /dev/null || true
+received=`cat R.tmp`
+rm -f R.tmp
+
+if [ "$data" == "$received" ]; then
+echo -e "${GREEN}misc_loop_drv blocking read/write test passed$NOCOLOR"
+else
+echo -e "${RED}misc_loop_drv blocking read/write test failed$NOCOLOR"
+echo expected $data
+echo received $received
+fi
+
+data=123nb
+echo -n $data > /dev/misc_loop_drv
+sleep 0.5
+received=`dd iflag=nonblock if=/dev/misc_loop_drv 2> /dev/null `
+if [ "$data" == "$received" ]; then
+echo -e "${GREEN}misc_loop_drv nonblocking read/write test passed$NOCOLOR"
+else
+echo -e "${RED}misc_loop_drv nonblocking read/write test failed$NOCOLOR"
+echo expected $data
+echo received $received
+fi
+
+rmmod misc_loop_drv.ko
+dmesg --notime --show-delta --read-clear 2>/dev/null > kernel.log || \
+dmesg -c > kernel.log && echo kernel.log saved
--
1.7.9.5
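
The test script above exercises blocking and nonblocking read/write, but
not the .poll handler. Below is a minimal user-space sketch of how poll
could be exercised. It is not part of the patch: the helper name
poll_read is hypothetical, and it works on any readable descriptor, so
it only assumes that the caller has opened the device node (the test
script uses /dev/misc_loop_drv).

```c
#include <errno.h>
#include <poll.h>
#include <unistd.h>

/*
 * poll_read() is a hypothetical helper, not part of the driver.
 * Wait up to timeout_ms milliseconds for fd to become readable, then
 * read up to len bytes into buf.
 * Returns the number of bytes read, 0 on timeout or end of file,
 * or -1 with errno set on error.
 */
ssize_t poll_read(int fd, char *buf, size_t len, int timeout_ms)
{
	struct pollfd pfd = { .fd = fd, .events = POLLIN };
	int ready;

	/* Restart the wait if a signal interrupts poll(). */
	do {
		ready = poll(&pfd, 1, timeout_ms);
	} while (ready < 0 && errno == EINTR);

	if (ready <= 0)
		return ready;	/* 0: timeout, -1: error */

	return read(fd, buf, len);
}
```

With the module loaded, a caller would open /dev/misc_loop_drv with
O_RDONLY | O_NONBLOCK, write a few bytes to the device from another
descriptor, and expect poll_read to return them once the tasklet has
moved the data into the input FIFO.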