Project proposal: Add pcc support for the Linux kernel.


Anders Magnusson <ragge@...>
 

Overview: Pcc is a small and simple compiler, yet it produces reasonably good code in
comparison with gcc, even though it runs 5-10 times faster. Pcc supports around a dozen
architectures to varying degrees, although the main focus these days has been i386 and amd64.

The compilation speed is a relevant factor when doing development, especially when compiling
parts of large projects together, which is quite common on embedded systems.

The proposal has two parts:
The first is to ensure that pcc produces correct code for the parts of the Linux kernel
that use gcc extensions. This is focused mainly on the i386 platform, which is one
of the most used embedded archs today.

The second is to add support for constant propagation and strength reduction, which is
something that the Linux kernel benefits greatly from, especially the first item.
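
As a rough illustration of the two optimizations named above, here is a hand-written sketch (not pcc output; all function names are invented for this example). In each pair, the first function is what a programmer might write and the second is what the compiler could effectively reduce it to:

```c
#include <stddef.h>

/* Constant propagation: 'shift' is known to be 3 at its only use. */
unsigned scale_before(unsigned x)
{
    unsigned shift = 3;
    return x << shift;
}

/* What the compiler can treat the function above as. */
unsigned scale_after(unsigned x)
{
    return x << 3;
}

/* Strength reduction: one multiplication per loop iteration ... */
void fill_before(unsigned *a, size_t n, unsigned stride)
{
    for (size_t i = 0; i < n; i++)
        a[i] = (unsigned)i * stride;
}

/* ... replaced by a running sum that is updated with an addition. */
void fill_after(unsigned *a, size_t n, unsigned stride)
{
    unsigned v = 0;
    for (size_t i = 0; i < n; i++) {
        a[i] = v;
        v += stride;
    }
}
```

Both versions of each pair compute the same results; only the amount of work done at run time differs.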

Pcc website: http://pcc.ludd.ltu.se

Scope:
Each one of the above items should take about 2 weeks to implement and test.


Rob Landley
 

On Sunday 20 December 2009 09:22:59 Anders Magnusson wrote:
The second is to add support for constant propagation and strength
reduction, which is something that the Linux kernel benefits greatly
from, especially the first item.
Constant propagation is actually required by C99. (I had to learn this stuff
back when I was maintaining a tinycc fork, before the original version
relaunched itself as a windows-only project.)

Also, does it do simple dead code elimination? Both the kernel and busybox do
a lot of if(MACRO_RESOLVES_TO_ZERO) { blah; } instead of #ifdefs.
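
For readers who have not seen the idiom, a minimal sketch follows. The macro and function names are made up for illustration; in the kernel or busybox the macro would come from the build configuration.

```c
/* Hypothetical configuration switch that resolves to zero. */
#define ENABLE_FEATURE_DEBUG 0

int debug_calls;

void log_debug(void)
{
    debug_calls++;
}

int process(int x)
{
    /*
     * Unlike an #ifdef block, this branch is still parsed and
     * type-checked. A compiler that does simple dead code
     * elimination emits no code for it.
     */
    if (ENABLE_FEATURE_DEBUG) {
        log_debug();
    }
    return x + 1;
}
```

The advantage over #ifdef is that the disabled branch cannot silently rot, since it must still compile.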

Rob
--
Latency is more important than throughput. It's that simple. - Linus Torvalds


Anders Magnusson <ragge@...>
 

Rob Landley wrote:
On Sunday 20 December 2009 09:22:59 Anders Magnusson wrote:

The second is to add support for constant propagation and strength
reduction, which is something that the Linux kernel benefits greatly
from, especially the first item.
Constant propagation is actually required by C99. (I had to learn this stuff back when I was maintaining a tinycc fork, before the original version relaunched itself as a windows-only project.)
Hm, not according to my version of the C99 standard anyway. Pcc supports everything
required by C99 AFAIK. What is missing is mostly in the gcc compat area.

Also, does it do simple dead code elimination? Both the kernel and busybox do a lot of if(MACRO_RESOLVES_TO_ZERO) { blah; } instead of #ifdefs.
Yes, that type of dead code elimination has been there at least 30 years :-) It also has a graph-coloring register allocator and can do SSA conversion which is
of great benefit for code quality.

-- Ragge


Rob Landley
 

On Monday 21 December 2009 02:56:11 Anders Magnusson wrote:
Rob Landley wrote:
On Sunday 20 December 2009 09:22:59 Anders Magnusson wrote:
The second is to add support for constant propagation and strength
reduction, which is something that the Linux kernel benefits greatly
from, especially the first item.
Constant propagation is actually required by C99. (I had to learn this
stuff back when I was maintaining a tinycc fork, before the original
version relaunched itself as a windows-only project.)
Hm, not according to my version of the C99 standard anyway. Pcc supports
everything required by C99 AFAIK. What is missing is mostly in the gcc
compat area.
Actually I was referring to section 6.6, "constant expressions"

http://busybox.net/~landley/c99-draft.html#6.6

You're referring to remembering that a constant value assigned to a variable
can't have been modified since, which is a slightly different thing and probably
not required by the spec. (I just remember it was one of the few
optimizations tinycc bothered to do.)
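
To make the distinction concrete, here is a small sketch (identifiers invented for this example). The first three uses are places where the language requires compile-time evaluation of a constant expression; the last function is the optional optimization, where a variable merely happens to hold a known value:

```c
#include <stddef.h>

/* Enumerator values must be integer constant expressions (6.6). */
enum { BUF_SIZE = 4 * 16 };

/* So must the size of an array declared at file scope. */
static char buf[BUF_SIZE + 1];

size_t buf_size(void)
{
    return sizeof buf;
}

int classify(int c)
{
    switch (c) {
    case '0' + 1:   /* case labels need constant expressions too */
        return 1;
    default:
        return 0;
    }
}

int propagated(void)
{
    int n = 10;     /* n is a variable, not a constant expression */
    return n * 2;   /* folding this to 20 is optional (constant propagation) */
}
```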

Also, does it do simple dead code elimination? Both the kernel and
busybox do a lot of if(MACRO_RESOLVES_TO_ZERO) { blah; } instead of
#ifdefs.
Yes, that type of dead code elimination has been there at least 30 years :-)
Alas, tinycc didn't do it. (It was on my todo list, the tricky bit was goto
labels that could jump _back_ into a block, in a one pass compiler I needed to
cache the to-be-discarded code into temporary buffers in case it turned out to
be needed, or something.)
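
A sketch of the tricky case, with made-up names: the body of the if (0) block can never be entered from the top, but a goto that appears later in the function jumps back into it, so the code cannot be thrown away when the block is first parsed.

```c
int count_retries(int fail_times)
{
    int attempts = 0;

    if (0) {
retry:
        /* Reached only through the goto below, never by falling in. */
        fail_times--;
    }

    attempts++;
    if (fail_times > 0)
        goto retry;     /* backward jump into the "dead" block */

    return attempts;
}
```

A one-pass compiler has not yet seen the goto when it reaches the closing brace of the block, which is why the generated code would need to be buffered rather than discarded.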

It also has a graph-coloring register allocator and can do SSA
conversion which is
of great benefit for code quality.
Coolness.

Any way to cache the lex/yacc output in the source tarball so those aren't
prerequisites for building it? (Perhaps that's already been done, my
knowledge of this project is a year and change out of date.)

Rob
--
Latency is more important than throughput. It's that simple. - Linus Torvalds


Anders Magnusson <ragge@...>
 

Rob Landley wrote:
On Monday 21 December 2009 02:56:11 Anders Magnusson wrote:

Rob Landley wrote:

On Sunday 20 December 2009 09:22:59 Anders Magnusson wrote:

The second is to add support for constant propagation and strength
reduction, which is something that the Linux kernel benefits greatly
from, especially the first item.
Constant propagation is actually required by C99. (I had to learn this
stuff back when I was maintaining a tinycc fork, before the original
version relaunched itself as a windows-only project.)
Hm, not according to my version of the C99 standard anyway. Pcc supports
everything required by C99 AFAIK. What is missing is mostly in the gcc
compat area.
Actually I was referring to section 6.6, "constant expressions"

http://busybox.net/~landley/c99-draft.html#6.6

You're referring to remembering that a constant value assigned to a variable can't have been modified since, which is a slightly different thing and probably not required by the spec. (I just remember it was one of the few optimizations tinycc bothered to do.)
Yes, true. Doing early evaluation of constant expressions is (of course :-) done
in pcc.

Also, does it do simple dead code elimination? Both the kernel and
busybox do a lot of if(MACRO_RESOLVES_TO_ZERO) { blah; } instead of
#ifdefs.
Yes, that type of dead code elimination has been there at least 30 years :-)
Alas, tinycc didn't do it. (It was on my todo list, the tricky bit was goto labels that could jump _back_ into a block, in a one pass compiler I needed to cache the to-be-discarded code into temporary buffers in case it turned out to be needed, or something.)
Variables that evaluate to constants and are then used in conditionals are easy to track
after SSA conversion and constant propagation, and pcc already has the elimination
code for it in place. It's just constant propagation that is missing :-)
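
A minimal sketch of the case being described (the function name is invented, and the comments describe the general technique rather than pcc's actual intermediate form):

```c
int pick(int x)
{
    int verbose = 0;    /* in SSA form: a single definition, verbose_1 = 0 */

    if (verbose)        /* after constant propagation: if (0) */
        x = x * 100;    /* now provably dead, so it can be eliminated */

    return x + 1;
}
```

With the constant propagated into the condition, this reduces to the same if (0) pattern that the existing elimination code already handles.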

It also has a graph-coloring register allocator and can do SSA
conversion which is
of great benefit for code quality.
Coolness.

Any way to cache the lex/yacc output in the source tarball so those aren't prerequisites for building it? (Perhaps that's already been done, my knowledge of this project is a year and change out of date.)
It would be possible, but it's not done yet, mostly because no one has asked about it :-)
Also, there is often OS-specific code included in the yacc/lex output, so it's not
just a matter of shipping the output file. But it wouldn't be difficult to do.

-- Ragge


Rob Landley
 

On Monday 21 December 2009 10:40:41 Anders Magnusson wrote:
Alas, tinycc didn't do it. (It was on my todo list, the tricky bit was
goto labels that could jump _back_ into a block, in a one pass compiler I
needed to cache the to-be-discarded code into temporary buffers in case
it turned out to be needed, or something.)
Variables that evaluate to constants and are then used in conditionals are
easy to track after SSA conversion and constant propagation, and pcc
already has the elimination code for it in place. It's just constant
propagation that is missing :-)
That's one of the few optimizations tinycc already did, so I tend to assume
it's in everything. :)

It also has a graph-coloring register allocator and can do SSA
conversion which is
of great benefit for code quality.
Coolness.

Any way to cache the lex/yacc output in the source tarball so those
aren't prerequisites for building it? (Perhaps that's already been done,
my knowledge of this project is a year and change out of date.)
It would be possible, but it's not done yet, mostly because no one has
asked about it :-)
Also, there is often OS-specific code included in the yacc/lex output,
so it's not just a matter of shipping the output file. But it wouldn't
be difficult to do.
It would be nice. I'm building self-bootstrapping Linux systems for various
targets so I can regression test linux, uClibc, and busybox built for each
target under qemu, meaning I try to get the dependencies down to a minimal
set. (Yes, I've been patching the newly introduced perl build dependency out
of the Linux kernel for over a year now.)

I'd be happy to regression test compiler versions too, but my gcc and binutils
are stuck at 4.2.1 and 2.17 (last GPLv2 release of each) and long term I'll
need to replace 'em to get armv7 and the new mips api and such. From a
dependency standpoint, replacing two packages with pcc+lex+yacc isn't an
improvement. (And last time I poked at it, building on Linux wasn't
particularly documented yet. I admit, it's been a while.)

Rob
--
Latency is more important than throughput. It's that simple. - Linus Torvalds


Tim Bird <tim.bird@...>
 

Anders Magnusson wrote:
Rob Landley wrote:
On Sunday 20 December 2009 09:22:59 Anders Magnusson wrote:

The second is to add support for constant propagation and strength
reduction, which is something that the Linux kernel benefits greatly
from, especially the first item.
Somehow, I didn't receive the original e-mail for this. But
I've made a page for it (based on the celinux-dev archive) at:
http://elinux.org/CELF_Project_Proposal/Add_pcc_support_for_Linux_kernel

Thanks for the proposal.
-- Tim

=============================
Tim Bird
Architecture Group Chair, CE Linux Forum
Senior Staff Engineer, Sony Corporation of America
=============================