The .a file is a relic: Why static archives were a bad idea all along (medium.com)
TuxSH 1 days ago [-]
> This design decision at the source level, means that in our linked binary we might not have the logic for the 3DES building block, but we would still have unused decryption functions for AES256.

Do people really not know about `-ffunction-sections -fdata-sections` & `-Wl,--gc-sections` (doesn't require LTO)? Why is it used so little when doing statically-linked builds?
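
For reference, a minimal sketch with the GNU toolchain (flag spellings differ for other linkers; file names are just placeholders):

    # put every function/object into its own section
    cc -O2 -ffunction-sections -fdata-sections -c foo.c bar.c
    # let the linker drop any section nothing references
    cc foo.o bar.o -Wl,--gc-sections -o app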

> Let’s say someone in our library designed the following logging module: (...)

Relying on static initialization order, and on runtime static initialization at all, is never a good idea IMHO

flohofwoe 1 days ago [-]
There's also the other 'old-school' method to compile each function into its own object file, I guess that's why MUSL has each function in its own source file:

https://github.com/kraj/musl/tree/kraj/master/src/stdio

...but these days -flto is simply the better option to get rid of unused code and data - and enable more optimizations on top. LTO is also exactly why static linking is strictly better than dynamic linking, unless dynamic linking is absolutely required (for instance at the operating system boundary).
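
E.g. a minimal sketch (GCC/Clang spelling, placeholder file names):

    # -flto stores compiler IR in the object files...
    cc -O2 -flto -c foo.c bar.c
    # ...and the link step re-optimizes across all of them, dropping dead code
    cc -O2 -flto foo.o bar.o -o app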

pjmlp 15 hours ago [-]
Or plugins, or hot-code-reload techniques, unless people want to go the OS IPC route, which seems forgotten by many despite being safer, though naturally more resource-demanding.
astrobe_ 1 days ago [-]
Yes, these are really esoteric options, and IIRC GCC's docs say they can be counter-productive.
readmodifywrite 1 days ago [-]
One engineer's esoteric is another's daily driver. All 3 of those options are borderline mandatory in embedded firmware development.
TuxSH 1 days ago [-]
These options can easily be found via a Google search or an LLM, whichever one prefers.

> they can be counter-productive

Rarely[1]. The only side effect this can have is the constant pools (for ldr rX, [pc, #off] kind of stuff) not being merged, but the negative impact is absolutely minimal (different functions usually use different constants after all!)

([1] assuming elf file format or elf+objcopy output)

There are many other upsides too: you can combine these options with -Wl,-wrap to e.g. prune exception symbols from already-compiled libraries and make the resulting binaries even smaller (depending on platform)
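
(For reference, the generic --wrap mechanics with GNU ld; malloc is just the textbook example here, not the exception-pruning case:)

    # calls to malloc get redirected to __wrap_malloc (defined in mywrap.o);
    # __real_malloc still refers to the original definition
    cc main.o mywrap.o -Wl,--wrap=malloc -o app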

The question is, why are function-sections and data-sections not the default?

It is quite annoying to have to deal with static libs (including standard libraries themselves) that were compiled with neither these flags nor LTO.

dmitrygr 1 days ago [-]
Esoteric? In embedded, people know of these from BEFORE they stop wearing diapers
jeffbee 1 days ago [-]
-ffunction-sections has 750k hits on github. It is among the default flags for opt mode builds in Bazel. There are probably people who consider them defaults, in practice.
astrobe_ 1 days ago [-]
Well, C and C++ together have around 7M repos, so about 10%. Actually not entirely esoteric, but Github is only a fraction of the world's codebase and users of these repos probably never looked in the makefile, so I'd say 10% of C/C++ developers knowing about this is a very optimistic estimate.
wtallis 23 hours ago [-]
Looking at GitHub is probably significantly undersampling the kinds of C projects that would be doing static linking, many of which pre-date GitHub.
greenavocado 1 days ago [-]
> Do people really not know about $OBSCURE_GCC_FLAG?

Do you know what you sound like?

alextingle 1 days ago [-]
If your whole business revolves around shipping static libraries to customers, then surely reading the man page to work out how that can be done is part of the job.

Also, having your library rely on static initialisation doesn't seem like a very sound architectural choice.

Honestly, this just sounds like whining from someone who can't be bothered to read the documentation, and can't get the coders to take him seriously. In the time it took him to write that article, he could have read up on his tools, and probably learned some social skills too.

izacus 24 hours ago [-]
And do you know how you sound when you call a well-known set of flags, the first hit on Google/AI, obscure?
viraptor 1 days ago [-]
https://xkcd.com/2501/

... It's easy to forget that the average person probably only knows two or three linker flags ...

pjmlp 1 days ago [-]
How can they be expected to learn this, when it is now fashionable to treat C and C++ as if they are scripting languages, shipping header only files?

We already had scripting engines for those languages in the 1990's, and the fact they are hardly available nowadays kind of tells of their commercial success, with exception of ROOT.

asveikau 1 days ago [-]
It makes more sense for c++ due to templates, but the header only C library trend is indeed very strange. It's not surprising that people are coming up now who are writing articles about being confused by static linking behavior.
pjmlp 1 days ago [-]
Even with C++ templates, if you want faster builds, header files aren't the place to store external templates, which are instantiations for common type parameters.
TuxSH 1 days ago [-]
Header-only is simpler to integrate, so it makes sense for simple stuff, or stuff that is going to be used by only one TU there.

However, the semantics of inline are different between C and C++. To put it simply, C is restricted to static inline and, for variables, static const, whereas C++ has no such limitations (making them a superset); and static inline/const can sometimes lead to binary size bloat

CyberDildonics 1 days ago [-]
It's not strange at all. You only have one file to keep track of and it does everything, you put the functions in any compilation unit you want, C compilation is basically instant, and putting a bunch of single file libraries into one compilation unit simplifies things further.
pjmlp 15 hours ago [-]
That might be why, back in 1999-2002, I was waiting around 1h for each OS build variant of our product, a mix of Tcl and native C libraries. Super fast.

It is only basically instant in toy examples, or with optimizations completely disabled.

CyberDildonics 8 hours ago [-]
Sqlite is 6 MB and can compile in 2 seconds on msvc.

It's 25 years later, if you are waiting for an hour to compile a single normal C program, there is a lot of room for optimization. Saying C doesn't compile fast because 25 years ago your own company made a program that compiled slow is pretty silly.

Single file C libraries are fantastic because they can easily all be put into more monolithic compilation units which makes compilation very fast and they probably won't need to be changed.

Have you actually tried what I'm talking about? What are you trying to say here, that you think single file libraries would have made your 1-hour pure C program compile slower?

asveikau 8 hours ago [-]
I used to compile sqlite regularly on msvc and it was more than 2 seconds. If this is a measurement it is a recent one with recent hardware.

Sqlite is a single compilation unit for very different reasons than the header-only libraries, by the way. It's developed as many different files and concatenated together for distribution because the optimizer does better for it that way.

CyberDildonics 7 hours ago [-]
> If this is a measurement it is a recent one with recent hardware.

This was on a 10 year old Xeon CPU.

> Sqlite is a single compilation unit for very different reasons than the header-only libraries, by the way. It's developed as many different files and concatenated together for distribution because the optimizer does better for it that way.

What difference does that make? What does it have to do with anything? You could develop anything like that.

The point here is that single file libraries work very well. If anything they end up making compilation times lower because they can be put into monolithic compilation units. When I say single file C libraries, I'm talking about single files that have a header switch to make them either header declarations or full function implementations. I'm not sure what you're trying to say.

bsder 24 hours ago [-]
> It makes more sense for c++ due to templates, but the header only C library trend is indeed very strange.

This is a symptom of the build tools ecosystem for C and C++ being an absolute dumpster fire (looking at you CMake).

TuxSH 1 days ago [-]
> How can they be expected to learn this

It's the first thing Google and LLMs 'tell' you when you ask about reducing binary size with static libraries. Also LTO does most of the same.

pjmlp 1 days ago [-]
To learn, first one needs to want to learn, which was my whole point.
TuxSH 1 days ago [-]
Agreed, but the article's author mentioned this as an issue; I would have expected him to find out about and mention these flags as well.
flohofwoe 1 days ago [-]
An STB-style header-only library is actually quite perfect for eliminating dead code if the implementation and all code using that library is in the same compilation unit (since the compiler will not include static functions into the build that are not called).

...or build with -flto for the 'modern' catch-all feature to eliminate any dead code.

...apart from that, none of the problems outlined in the blog post apply to header only libraries anyway since they are not distributed as precompiled binaries.

amiga386 1 days ago [-]
> Yet, what if the logger’s ctor function is implemented in a different object file?

This is a contrived example akin to "what if I only know the name of the function at runtime and have to dlsym()"?

Have a macro that "enables use of" the logger that the API user must place in global scope, so it can write "extern ctor_name;". Or have library-specific additions to LDFLAGS to add --undefined=ctor_name.
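
E.g. something along these lines (the symbol name is hypothetical):

    # force the linker to pull in the object that defines the ctor,
    # even though nothing references it (hypothetical symbol name)
    cc main.o -Wl,--undefined=mylib_logger_ctor -lmylib -o app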

There are workarounds for this niche case, and it doesn't add up to ".a files were a bad idea", that's just clickbait. You'll appreciate static linkage more on the day after your program survives a dynamic linker exploit

> Every non-static function in the SDK is suddenly a possible cause of naming conflict

Has this person never written a C library before? Step 1: make all globals/functions static unless they're for export. Step 2: give all exported symbols and public header definitions a prefix, like "mylibname_", because linkage has a global namespace. C++ namespaces are just a formalisation of this

Joker_vD 1 days ago [-]
> This is a contrived example akin to "what if I only know the name of the function at runtime and have to dlsym()"?

Well, you just do what the standard Linux loader does: iterate through the .so's in your library path, loading them one by one and doing dlsym() until it succeeds :)

Okay, the dynamic loader actually only tries the .so's whose names are explicitly mentioned as DT_NEEDED in the .dynamic section but it still is an interesting design choice that the functions being imported are not actually bound to the libraries; you just have a list of shared objects, and a list of functions that those shared objects, in totality, should provide you with.

lokar 1 days ago [-]
Also, don’t use automatic module init, make the user call an init function at startup.

And prefix everything in your library with a unique string.

layer8 1 days ago [-]
What if you use two libraries A and B that both happen to use library C under the hood? Is the application expected to initialize all dependencies in the right order at the top level? Or is library initialization supposed to be idempotent?

This all works as long as libraries are “flat”, but doesn’t scale very well once libraries are built on top of each other and want to hide implementation details.

lokar 1 days ago [-]
The call to init should be idempotent
layer8 1 days ago [-]
That can be difficult in a multi-threaded environment with dynamically loaded shared libraries. Or at least it isn’t something that’s generally expected to be guaranteed to work.
TuxSH 1 days ago [-]
C++ "magic statics" handle that use case (but with hidden atomic flag load (& more) costs at each access)
lokar 1 days ago [-]
Ideally they would do the explicit init at startup before starting threads
eyalitki 1 days ago [-]
Agree, there should be a prefix. But if 2 of my dependencies didn't use a prefix, why is it my fault when I fail to link against them?

Also, some managers object to a prefix within non-api functions, and frankly I can understand them.

kazinator 1 days ago [-]
.a archives can speed up linking of very large software. This is because of assumptions as to the dependencies and the way the traditional Unix-style linker deals with .a files (by default).

When a bunch of .o files are presented to the linker, it has to consider references in every direction. The last .o file could have references to the first one, and the reverse could be true.

This is not so for .a files. Every successive .a archive presented on the linker command line in left-to-right order is assumed to satisfy references only in material to the left of it. There cannot be circular dependencies among .a files and they have to be presented in topologically sorted order. If libfoo.a depends on libbar.a then libfoo.a must be first, then libbar.a.

(The GNU Linker has options to override this: you can demarcate a sequence of archives as a group in which mutual references are considered.)
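
Concretely, with GNU ld (library names are placeholders):

    # libfoo.a depends on libbar.a, so the dependent archive comes first
    cc main.o -L. -lfoo -lbar -o app
    # mutually dependent archives have to be wrapped in a group
    cc main.o -L. -Wl,--start-group -lfoo -lbar -Wl,--end-group -o app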

This property of archives (or of the way they are treated by linking) is useful enough that at some point when the Linux kernel reached a certain size and complexity, its build was broken into archive files. This reduced the memory and time needed for linking it.

Before that, Linux was linked as a list of .o files, same as most programs.

rixed 1 days ago [-]
Do people who write this kind of piece with such peremptory titles really believe that they finally came around to understanding everything better after decades of ignorance?

Chesterton’s Fence yada yada?

dragonwriter 23 hours ago [-]
> Do people who write this kind of piece with such peremptory titles really believe that they finally came around to understanding everything better after decades of ignorance?

No, for the most part, they think that peremptory titles draw readership better, and are using their articles as personal brand marketing.

cap11235 1 days ago [-]
Well, it is on medium.com, so probably yes?
immibis 24 hours ago [-]
Linking works the way it does because of inertia.

There's a pretty broad space of possible ways to link (both static and dynamic). Someone once wrote one of them, and it was good enough, so it stuck, and spread, and now everything assumes it. It's far from the only possible way, but you'd have to write new tooling to do it a different way, and that's rarely worth it. The way they selected is pretty reasonable, but not optimal for some use cases, and it may even be pathological for some.

At least Linux provides the necessary hooks to change it: your PT_INTERP doesn't have to be /lib64/ld-linux-x86-64.so.2

Actually, a whole lot of things in computing work this way, from binary numbers, to 8-bit bytes, to filesystems, to file handles as a concept, to IP addresses and ports, to raster displays. There were many solutions to a problem, and then one was implemented, and it worked pretty well and it spread, even though other solutions were also possible, and now we can build on top of that one instead of worrying about which one to choose underneath. If you wanted to make a computer from scratch you'd have to decide whether binary is better than decimal, bi-quinary or balanced ternary... or just copy the safe, widespread option. (Contrary to popular belief, very early computers used a variety of number bases other than binary)

pjmlp 15 hours ago [-]
Linking is not done the same way on all platforms or in all compiled programming languages; if anything, most of the inertia has been in FOSS OSes.
immibis 11 hours ago [-]
Really, on which platform is a static library not a bundle of object files?
pjmlp 10 hours ago [-]
Moving goalposts; your point apparently was about linking as a general OS process for making libraries, not about static libraries in particular.

So many differences: not every programming language relies on the OS linker UNIX-style; many have had the linking process as part of their own toolchain. Outside the BSDs/Linux, other vendors have made improvements to their object file formats, linking algorithms, implementations of linking processes, and delayed linking on demand.

Bytecode-based OSes like Burroughs/ClearPath MCP and OS/400 (IBM i) also have their own approaches to how linking takes place, and so forth.

Some OSes like Windows Phone 8.x used a mixed executable format with bytecode and machine symbols, with a final linking step during installation on the device.

Oberon used slim binaries with compressed ASTs, JIT compiled on load, or plain binaries. The linker was yet another compiler pass.

Many other examples to look into, across systems research.

Far from "Linking works the way it does because of inertia.".

queenkjuul 19 hours ago [-]
Well you'd first have to invent the universe of course
EE84M3i 1 days ago [-]
Something I've never quite understood is why you can't statically link against an .so file. What specific information was lost during the linking phase that created the shared object that prevents that machine code from being placed into a PIE executable?
sherincall 1 days ago [-]
wcc can do that for you: https://github.com/endrazine/wcc
EE84M3i 1 days ago [-]
Woah this is really awesome! Thanks for sharing, this made my day.
krackers 1 days ago [-]
I had this exact question a few months back - https://news.ycombinator.com/item?id=44084781
accelbred 1 days ago [-]
so files require PIC code, which brings along symbol interpolation.
EE84M3i 21 hours ago [-]
Shouldn't you use PIE for executables anyway?
accelbred 21 hours ago [-]
PIE code is different than PIC. PIE can assume no interposition.
EE84M3i 19 hours ago [-]
Sorry what do you mean by "symbol interpolation" and "interposition" in this context?

Naively, I would assume you can just take the sections out of the shared object and slap them into the executable. They're both position independent, so what's the issue?

If PIE allows greater assumptions to be made by the compiler/linker than PIC that sounds great for performance, but doesn't imply PIC code won't work in a PIE context.

accelbred 17 hours ago [-]
PIC code would work where PIE does, but likely perform worse. In shared libraries, calls to any non-static function can't be assumed to resolve to the same function at runtime, since another linked library or one using LD_PRELOAD may also define the symbol. Thus all calls to non-static functions must go through the shared library lookup machinery. This prevents inlining opportunities as well. Functions in an executable can't be overridden in this manner, and they override the symbols in shared libraries. Thus PIE code can have cheaper function calls and freely inline functions.
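
A rough illustration of the interposition point (file and symbol names hypothetical):

    # libfoo.so exports foo(); libevil.so also defines a foo()
    cc -O2 -fPIC -shared foo.c  -o libfoo.so
    cc -O2 -fPIC -shared evil.c -o libevil.so
    # with default visibility, even libfoo's internal calls to foo()
    # now resolve to libevil's definition, so they couldn't have been inlined
    LD_PRELOAD=./libevil.so ./app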

It's not that you couldn't use the PIC code, but it would be better to just recompile with PIE.

LtWorf 1 days ago [-]
You can, but why?
EE84M3i 1 days ago [-]
At a fundamental level I don't understand why we have two separate file types for static and dynamic libraries. It seems primarily for historical reasons?

The author proposes introducing a new kind of file that solves some of the problems with .a files - but we already have a perfectly good compiled library format for shared libraries! So why can't we make gcc sufficiently smart to allow linking against those statically and drop this distinction?

Joker_vD 1 days ago [-]
Oh, it's even better on the Windows side of things, at least how the MSVC toolchain does it. You can only link a statically-linked .lib library, period. So if you want to statically link against a dynamic library (what a phrase!), you need to have a special version of that .lib library that essentially is just a collection of thunks (in MSVC-specific format) that basically say "oh, you actually want to add symbol Bar@8 from LIBFOO.DLL to your import section" [0]. So yeah, you'd see three binaries distributed as a result of building a library: libfoo_static.lib (statically-linked library), libfoo.dll (dynamic library), libfoo.lib (the shim library to link against when you want to link to libfoo.dll).

Amusingly, other (even MSVC-compatible) toolchains never had such a problem; e.g. Delphi could straight up link against a DLL you tell it to use.

[0] https://learn.microsoft.com/en-us/cpp/build/reference/using-...

Brian_K_White 1 days ago [-]
"So if you want to statically link against a dynamic library (what a phrase!)"

Yes but like an artificially created remarkableness. "dynamic library" should just be "library", and then it's not remarkable at all.

It does seem obvious, as your Delphi example and the wcc example in the other comment show, that if an executable can be assembled from .so files at run time, then the same thing can also be done at any other time. All the pieces are just sitting there wondering why we're not using them.

alexvitkov 1 days ago [-]
Because with the current compilation model shared libraries (.so/.dll) are the output of the linker, but static libraries are input for the linker. It is historical baggage, but as it currently stands they're fairly different beasts.
Brian_K_White 1 days ago [-]
This is succinct, thank you.
convolvatron 1 days ago [-]
You could say historical reasons, in that dynamic libraries are generated using relocatable position-independent code (-fPIC), which incurs some performance penalty vs code where the linker fills in all the relocations. My guess is that's somewhere around 10%? Historical in the sense that that used to be enough to matter? Idk that it still is.

Personally, I think leaving the binding of libraries to runtime opens up a lot of room for problems, and maybe the savings of having a single copy of a library loaded into memory vs N specialized copies isn't important anymore either.

EE84M3i 21 hours ago [-]
Isn't it best practice now to have all your code be PIC/PIE?
alexvitkov 1 days ago [-]
Because I want my program to run on other people's computers.
tempay 1 days ago [-]
I think the question isn’t why statically link but rather why bother with .a files and instead use the shared libraries all the time (even if only to build a statically linked executable).
alexvitkov 1 days ago [-]
Yeah, I misunderstood the question. Although if you could statically link .so/.dlls and have it work reliably, it would still be a great convenience, as some libraries are really hard to build statically without rewriting half their build system.
o11c 19 hours ago [-]
Then just use an rpath with `${ORIGIN}` and ship your .so files along with your binary.
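
E.g. (a sketch; note the single quotes so the shell doesn't expand $ORIGIN, and the ${ORIGIN} spelling works too):

    # embed a loader-relative rpath and ship libfoo.so next to the binary
    cc main.o -L. -lfoo -Wl,-rpath,'$ORIGIN' -o app
    # or keep libraries in a subdirectory: -Wl,-rpath,'$ORIGIN/lib'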

The only time where you shouldn't do this is if your executable requires setuid or similar permission bits, but those generally can only reasonably be shipped through distro repos.

tux3 1 days ago [-]
I actually wrote a tool to fix exactly this asymmetry between dynamic libraries (a single object file) and static libraries (actually a bag of loose objects).

I never really advertised it, but what it does is take all the objects inside your static library and tell the linker to make a static library that contains a single merged object.

https://github.com/tux3/armerge

The huge advantage is that with a single object, everything works just like it would for a dynamic library. You can keep a set of public symbols and hide your private symbols, so you don't have pollution issues.

Objects that aren't needed by any public symbol (recursively) are discarded properly, so unlike --whole-archive you still get the size benefits of static linking.

And all your users don't need to handle anything new or to know about a new format, at the end of the day you still just ship a regular .a static library. It just happens to contain a single object.

I think the article's suggestion of a new ET_STAT is a good idea, actually. But in the meantime the closest to that is probably to use ET_REL, a single relocatable object in a traditional ar archive.

amluto 1 days ago [-]
Is there any actual functional difference between the author's proposed ET_STAT and an appropriately prepared ET_REL file?

For that matter, I’ve occasionally wondered if there’s any real reason you can’t statically link an ET_DYN (.so) file other than lack of linker support.

tux3 1 days ago [-]
I think everything that you would want to do with an ET_STAT file is possible today, but it is a little off the beaten path, and the toolchain command line options today aren't as simple as for dynamic libraries (e.g. figuring out how to hide symbols in a relocatable object is completely different on the GNU toolchain, LLVM on Linux, or Apple-LLVM which also supports relocatable objects, but has a whole different object file format).

I would also be very happy to have one less use of the legacy ar archive format. A little-known fact is that this format is actually not standard at all; there are several variants floating around that are sometimes incompatible (Debian ar, BSD ar, GNU ar, ...).

stabbles 1 days ago [-]
It sounds interesting, but I think it would be better if a linker could resolve dependencies of static libraries the way it's done with shared libraries. Then you can update individual files without having to worry about outdated symbols in these merged files.
tux3 1 days ago [-]
If you mean updating some dependency without recompiling the final binary, that's not possible with static linking.

However the ELF format does support complex symbol resolution, even for static objects. You can have weak and optional symbols, ELF interposition to override a symbol, and so forth.

But I feel like for most libraries it's best to keep it simple, unless you really need the complexity.

dzaima 1 days ago [-]
How possible would it be to have a utility that merges multiple .o files (or equivalently a .a file) into one .o file, by changing all hidden symbols to local ones (i.e. akin to C's "static")? Would solve the private symbols leaking out, and give a single object file that's guaranteed to link as a whole. Or would that break too many assumptions made by other things?
Joker_vD 1 days ago [-]
Like, a linker, with "objcopy --strip-symbols" run as the post-step? I believe you can do this even today.
dzaima 1 days ago [-]
--localize-hidden seems to be more what I was thinking of. So this works:

    ld --relocatable --whole-archive crappy-regular-static-archive.a -o merged.o
    objcopy --localize-hidden merged.o merged.o
This should (?) then solve most issues in the article, except that including the same library twice still results in an error.
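
(And if a regular .a is still wanted for distribution, the merged object can simply be repacked:)

    ar rcs merged.a merged.o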
reactordev 1 days ago [-]
I did this with my dependencies for my game engine. Built them all as libs and used linker to merge them all together. Makes building my codebase as easy as -llibutils
krackers 1 days ago [-]
It seems like this is precisely what the other commenter implemented? https://news.ycombinator.com/item?id=44645423
benreesman 1 days ago [-]
I routinely tear apart badly laid-out .a files and re-ar them into something useful. It's a few lines of bash.
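
(Roughly this kind of thing; a sketch that ignores the duplicate-member edge case mentioned below:)

    # explode each archive into its own directory, then repack as one archive
    for a in libfoo.a libbar.a; do
        mkdir -p "objs/$a" && (cd "objs/$a" && ar x "../../$a")
    done
    ar rcs libcombined.a objs/*/*.o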
tux3 1 days ago [-]
This works, but scripting with the ar tool is annoying because it doesn't handle all the edge cases of the .a format.

For instance, if two libraries have a source file foo.c with the same name, you can end up with two foo.o, and when you extract them they overwrite each other. So you might think to rename them, but actually this nonsense can happen with two foo.o objects in the same archive.

The errors you get when running into these are not fun to debug.

benreesman 1 days ago [-]
This is the nastiest one in my `libmodern-cpp` suite: https://gist.github.com/b7r6/31a055e890eaaa9e09b260358da897b....

It took a few minutes, probably has a few edge cases I haven't banged out yet, and now I get to `-l` and I can deploy with `rsync` instead of fucking Docker or something.

I take that deal.

benreesman 1 days ago [-]
`boost` is a little sticky too: https://gist.github.com/b7r6/e9d56c0f6d55bc0620b2ce190e15d44...

but for your trouble: https://gist.github.com/b7r6/0cc4248e24288551bcc06281c831148...

If there's interest in this I can make a priority out of trying to get it open-sourced.

tux3 1 days ago [-]
Yes, boost is one of those that gave me the biggest trouble as well.

I feel like we really need better toolchains in the first place. None of this intrinsically needs to be made complex, it's all a lack of proper support in the standard tools.

benreesman 1 days ago [-]
It's not, though. As you can see from building two of the most notoriously nasty libraries on God's earth, emitting reasonable `pkg-config` is trivial: it's string concatenation.

The problem is misaligned incentives: CMake is bad, but it was sort of in the right place at the right time and became a semi-standard, and it's not in the interests of people who work in the CMake ecosystem to emit correct standard artifact manifests.

Dynamic linking by default is bad, but the gravy train on that runs from Docker to AWS to insecure-by-design TLS libraries.

The fix is for a few people who care more about good computing than money or fame to do simple shit like I'm doing above and make it available. CMake will be very useful in destroying CMake: it already encodes the same information that correct `pkg-config` needs.

cryptonector 1 days ago [-]
It's not that .a files and static linking are a relic, but that static linking never evolved like dynamic linking did. Static linking is stuck with 1978 semantics, while dynamic linking has grown features that prevent the mess that static linking made. There are legit reasons for wanting static linking in 2025, so we really ought to evolve static linking like we did dynamic linking.

Namely we should:

  - make -l and -rpath options in .a generation do something:
    record that metadata in the .a

  - make link-edits use the metadata recorded in .a files by the
    previous item

I.e., start recording dependency metadata in .a files so that we can stop flattening dependency trees onto the final link-edit.

This will allow static linking to have the same symbol conflict resolution behaviors as dynamic linking.

stabbles 1 days ago [-]
Much of the dynamic section of shared libraries could just be translated to a metadata file as part of a static library. It's not breaking: the linker skips files in archives that are not object files.

binutils implemented this with `libdep`, it's just that it's done poorly. You can put a few flags like `-L /foo -lbar` in a file `__.LIBDEP` as part of your static library, and the linker will use this to resolve dependencies of static archives when linking (i.e. extend the link line). This is much like DT_RPATH and DT_NEEDED in shared libraries.
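
I.e. roughly (a sketch of the convention described above; the paths are hypothetical):

    # record the archive's own link dependencies inside the archive itself
    echo "-L/opt/bar/lib -lbar" > __.LIBDEP
    ar rcs libfoo.a foo.o __.LIBDEP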

It's just that it feels a bit half-baked. With dynamic linking, symbols are resolved and dependencies recorded as you create the shared object. That's not the case when creating static libraries.

But even if tooling for static libraries with the equivalent of DT_RPATH and DT_NEEDED was improved, there are still the limitations of static archives mentioned in the article, in particular related to symbol visibility.

dale_glass 1 days ago [-]
Oh, static linking can be lots of "fun". I ran into this interesting issue once.

1. We have libshared. It's got logging and other general stuff. libshared has static "Foo foo;" somewhere.

2. We link libshared into libfoo and libbar.

3. libfoo and libbar then go into application.

If you do this statically, what happens is that the Foo constructor gets invoked twice, once from libfoo and once from libbar. And also gets destroyed twice.

rramadass 16 hours ago [-]
But this is expected behaviour. The Linker cannot know about your intent but is "dumb" in that it only follows some simple rules. Both libfoo and libbar have their own copy of the .o from libshared containing the "Foo foo" instance. Thus the .init/.fini sections in libfoo and libbar make calls to the ctor/dtor of their own "Foo foo" instances resulting in the observed two calls in the app.

The way people generally solve this problem is by using a helper class in the library header file which does reference counting for proper initialization/destruction of a single global instance. For an example see std::ios_base::Init in the standard C++ library - https://en.cppreference.com/w/cpp/io/ios_base/Init

To understand the basics of how linking (both static and dynamic) works see;

1) Hongjiu Lu's ELF: From the Programmer's Perspective - https://ftp.math.utah.edu/u/ma/hohn/linux/misc/elf/elf.html

2) Ian Lance Taylor's 20-part linker essay on his blog; ToC here - https://lwn.net/Articles/276782/

kazinator 1 days ago [-]
> Yet, what if the logger’s ctor function is implemented in a different object file? Well, tough luck. No one requested this file, and the linker will never know it needs to link it to our static program. The result? crash at runtime.

If you have spontaneously called initialization functions as part of an initialization system, then you need to ensure that the symbols are referenced somehow. For instance, a linker script which puts them into a table that is in its own section. Some start-up code walks through the table and calls the functions.

This problem has been solved; take a look at how U-boot and similar projects do it.

This is not an archive problem because the linker will remove unused .o files even if you give it nothing but a list of .o files on the command line, no archives at all.

flohofwoe 1 days ago [-]
Library files are not the problem, deploying an SDK as precompiled binary blobs is ;)

(I bet that .a/.lib files were originally never really meant for software distribution, but only as intermediate file format between a compiler and linker, both running as part of the same build process)

eyalitki 1 days ago [-]
Yeah, but when the product is an SDK, and customers develop on top of it (using their own toolchains) there isn't a lot left for me to play with.
triknomeister 1 days ago [-]
SDK could ship the source, lol, stop kneecapping your consumers.
harryvederci 1 days ago [-]
Minor suggestion: the article refers to a RHEL 6 developer guide section about static linking. Maybe a more recent article can be used (if their viewpoint hasn't changed).
jhallenworld 1 days ago [-]
On the private symbol issue... there is probably a solution to this already. You can partially link a bunch of object files into a single object file (see ld -r). After this is done, 'strip' the file except for those symbols marked with non-hidden visibility- I've not tried to do this, maybe 'strip -x' does the right thing? Not sure.
eyalitki 1 days ago [-]
1. "Advanced" compilation environments (meson) probably limit this ability to some extent. 2. Package managers (rpmbuild for instance) mandate build with debug symbols and they do the strip on their own so to create the debug packages. This limits our control of these steps.
layer8 1 days ago [-]
> Something like a “Static Bundle Object” (.sbo) file, that will be closer to a Shared Object (.so) file, than to the existing Static Archive (.a) file.

Is there something missing from .so files that wouldn’t allow them to be used as a basis for static linking? Ideally, you’d only distribute one version of the library that third parties can decide to either link statically or dynamically.

dwattttt 23 hours ago [-]
Shared libraries are linked together in a lossy step. I don't believe it's theoretically impossible; as an unsatisfying proof of concept, you could 'statically' link the .so by archiving it in the final binary, unpacking it at runtime, and dynamically linking it.

The static linker would be prevented from seeing multiple copies of code too.

parpfish 1 days ago [-]
Relic isn't the right word.

Relics are really old things that are revered and honored.

I think they just want "archaic", which describes old things that are likely obsolete.

Biganon 8 hours ago [-]
vestige
benreesman 1 days ago [-]
It is unclear to me what the author's point is. It seems to center on the example of DPDK being difficult to link (and it is a bear, I've done it recently).

But it's full of strawmen and falsehoods, the most notable being the claims about the deficiencies of pkg-config. pkg-config works great; it is just very rarely produced correctly by CMake.

I have tooling and a growing set of libraries that I'll probably open source at some point for producing correct pkg-config from packages that only do lazy CMake. It's glorious. Want abseil? -labsl.
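
For the curious, a .pc file really is just a few variables plus flag lines; a hypothetical example:

    # mylib.pc (hypothetical library)
    prefix=/opt/mylib
    libdir=${prefix}/lib
    includedir=${prefix}/include

    Name: mylib
    Version: 1.0.0
    Description: example library
    Cflags: -I${includedir}
    Libs: -L${libdir} -lmylib

Consuming it is then just `cc main.c $(pkg-config --cflags --libs mylib) -o app`.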

Static libraries have lots of game-changing advantages, but performance, security, and portability are the biggest ones.

People with the will and/or resources (FAANGs, HFT) would laugh in your face if you proposed DLL hell as standard operating procedure. That shit is for the plebs.

It's like symbol stripping: do you think maintainers trip an assert and see a wall of inscrutable hex? They do not.

Vendors like things good for vendors. They market these things as being good for users.

sp1rit 1 days ago [-]
> Static libraries have lots of game-changing advantages, but performance, security, and portability are the biggest ones.

No idea how you come to that conclusion, as they are definitively no more secure than shared libraries. Rather the opposite is true, given that you (as end user) are usually able to replace a shared library with a newer version, in order to fix security issues. Better portability is also questionable, but I guess it depends on your definition of portable.

l72 1 days ago [-]
I think from a security point of view, if a program is linked to its library dynamically, a malicious actor could replace the original library without the user noticing, by just setting the LD_LIBRARY_PATH to point to the malicious library. That wouldn't be possible with a program that is statically linked.
benreesman 1 days ago [-]
And unless you're in one of those happy jurisdictions where digital rights are respected, that malicious threat actor could range from a mundane cyber criminal to an advanced persistent threat, and that advanced persistent threat could trivially be your own government. Witness: the only part of `glibc` that really throws a fit if you yank its ability to get silently replaced via `soname` is DNS resolution.
Brian_K_White 1 days ago [-]
You act as though the sales pitch for dynamically loaded shared libs is the whole story.

Obviously everything has some reason it was ever invented, and so there is a reason dynamic linking was invented too, and so congratulations, you have recited that reason.

A trivial and immediate counter example though is that a hacker is able to replace your awesome updated library just as easily with their own holed one, because it is loaded on the fly at run-time and the loading mechanism has lots of configurability and lots of attack surface. It actually enables attacks that wouldn't otherwise exist.

And a self contained object is inherently more portable than one with dependencies that might be either missing or incorrect at run time.

There is no simple single best idea for anything. There are various ideas with their various advantages and disadvantages, and you use whichever best services your priorities of the moment. The advantages of dynamic libs and the advantages of static both exist and sometimes you want one and sometimes you want the other.

rramadass 17 hours ago [-]
A hacker can easily replace your shared library with their own malicious version or intercept calls into one as needed. As the number of distinct binary blobs for an application increases, the surface area for attack vectors increases making security a nightmare. Every piece also needs to be individually signed and authenticated adding more complexity to the application deployment.

As the gp mentioned, Static libraries have a lot of advantages by having only one binary to sign, authenticate and lockdown/test/prove the public interface. The idea is extended into the "Unikernel" approach where even the OS becomes part of the single binary which is then deployed to bare-metal (embedded systems) or a Hypervisor.

benreesman 1 days ago [-]
Knowing what code runs when I invoke an executable or grant it permissions is a fucking prerequisite for any kind of fucking security.

Portability is to any fucking kernel in a decade at the ABI level. You don't sound stupid, which means you're being dishonest. Take it somewhere else before this gets old-school Linus.

I have no fucking patience when it comes to either Drepper and his goons or the useful idiots parroting that tripe at the expense of less technical people.

edit: I don't like losing my temper anywhere, especially in a community where I go way back. I'd like to clarify that I see this very much in terms of people with power (technical sophistication) and their relationship to people who are more vulnerable (those lacking that sophistication) in matters of extremely high stakes. The stakes at the low end are the cost and availability of computing. The high end is as much oppressive regime warrantless wiretap Gestapo shit as you want to think about.

Hackers have a responsibility to those less technical.

Orphis 1 days ago [-]
pkg-config works great in limited scenarios. If you try to do anything more complex, you'll probably run into some complex issues that require modifying the supplied .pc files from your vendor.

There is a new standard being developed by some industry experts that aims to address this, called CPS. You can read the documentation on the website: https://cps-org.github.io/cps/ . There's a section with some examples of what they are trying to fix and how.

benreesman 1 days ago [-]
`pkg-config` works great in just about any standard scenario: it puts flags on a compile and link line that have been understood by every C compiler and linker since the 1970s.

Here's Bazel consuming it with zero problems, and if you have a nastier problem than a low-latency network system calling `liburing` on specific versions of the kernel built with Bazel? Stop playing.

The last thing we need is another failed standard further balkanizing an ecosystem that has worked fine, if used correctly, for 40+ years. I don't know what industry expert means, but I've done polyglot distributed builds at FAANG scale for a living, so my appeal to authority is as good as anyone's, and I say `pkg-config` as a base for the vast majority of use cases, with some special path for, like, compiling `nginx` with its zany extension mechanism, is just fine.

https://gist.github.com/b7r6/316d18949ad508e15243ed4aa98c80d...

Orphis 1 days ago [-]
Have you read the rationale about CPS? It gives clear examples as to why it doesn't work. You need to parse the files and then parse all the compiler and linker arguments in order to understand what to do with those to properly consume them.

What do you do if you use a compiler or linker that doesn't use the same command line parameters as they are written in the pc file? What do you do when different packages you depend on have conflicting options, for example one depending against different C or C++ language versions?

It's fine in a limited and closed environment, but it does not work for proper distribution, and your Bazel rules prove it, as they clearly don't work in all environments. It does not work with MSVC-style flags, or handle include files well (hh, hxx...). Not saying it can't be fixed, but that's just a very limited integration, which proves the point of having a better format for tool consumption.

And you're not the only one who has worked in a FAANG company around and dealt with large and complex build graphs. But for the most part, FAANGs don't all care about consuming pkg-config files, most will just rewrite the build files for Blaze / Bazel (or Buck2 from what I've heard). Very few people want to consume binary archives as you can't rebuild with the new flavor of the week toolchain and use new compiler optimizations, or proper LTO etc.

benreesman 23 hours ago [-]
Yeah, I read this:

"Although pkg-config was a huge step forward in comparison to the chaos that had reigned previously, it retains a number of limitations. For one, it targets UNIX-like platforms and is somewhat reliant on the Filesystem Hierarchy Standard. Also, it was created at a time when autotools reigned supreme and, more particularly, when it could reasonably be assumed that everyone was using the same compiler and linker. It handles everything by direct specification of compile flags, which breaks down when multiple compilers with incompatible front-ends come into play and/or in the face of “superseded” features. (For instance, given a project consuming packages “A” and “B”, requiring C++14 and C++11, respectively, pkg-config requires the build tool to translate compile flags back into features in order to know that the consumer should not be built with -std=c++14 ... -std=c++11.)

Specification of link libraries via a combination of -L and -l flags is a problem, as it fails to ensure that consumers find the intended libraries. Not providing a full path to the library also places more work on the build tool (which must attempt to deduce full paths from the link flags) to compute appropriate dependencies in order to re-link targets when their link libraries have changed.

Last, pkg-config is not an ideal solution for large projects consisting of multiple components, as each component needs its own .pc file."

So going down the list:

- FHS assumptions: false, I'm doing this on NixOS and you won't find a more FHS-hostile environment

- autotools era: awesome, software was better then

- breaks with multiple independent compiler frontends that don't treat e.g. `-isystem` in a reasonable way? you can have more than one `.pc` file, people do it all the time, also, what compilers are we talking about here? mingw gcc from 20 years ago?

- `-std=c++11` vs. `-std=c++14`? just about every project big enough to have a GitHub repository has dramatically bigger problems than what amounts to a backwards-compatible point release from a decade ago. we had a `cc` monoculture for a long time, then we had diversity for a while, and it's back to just a couple of compilers that try really hard to understand one another's flags. speaking for myself? in 2025 i think it's good that `gcc` and `clang` are fairly interchangeable.

So yeah, if this was billed as `pkg-config` extensions for embedded, or `pkg-config` extensions for MSVC, sure. But people doing non-gcc, non-clang compatible builds already know they're doing something different, price you pay.

This is the impossible perfect being the enemy of the realistic great with a healthy dose of "industry expertise". Do some conventions on `pkg-config`.

The alternative to sensible builds with working tools we have isn't this catching on, it won't. The alternative is CMake jank in 2035 just like 2015 just like now.

edit: brought to us by KitWare, yeah fuck that. KitWare is why we're in this fucking mess.

eyalitki 1 days ago [-]
If someone needs a wrapper for a technology that modifies the output it provides (like meson and bazel do), maybe there is an issue with said technology.

If pkg-config was never meant to be consumed directly, and was always meant to be post processed, then we are missing this post processing tool. Reinventing it in every compilation technology again and again is suboptimal, and at least Make and CMake do not have this post processing support.

benreesman 1 days ago [-]
This was the point of posting the trivial little `pkg.bzl` above: Bazel doesn't need to do all this crazy stuff in `rules_cc` and `rules_foreign_cc`: those are giant piles of internal turf wars within the Blaze team that have spilled onto GitHub.

The reason why we can't have nice things is that nice things are simple and cheap and there's no money or prestige in it. `zig cc` demonstrates the same thing.

That setup:

1. mega-force / sed / patch / violate any build that won't produce compatible / standard archives: https://gist.github.com/b7r6/16f2618e11a6060efcfbb1dbc591e96...

2. build sane pkg-config from CMake vomit: https://gist.github.com/b7r6/267b4401e613de6e1dc479d01e795c7...

3. profit

delivers portable (trivially packaged up as `.deb` or anything you want), secure (no heartbleed 0x1e in `libressl`), fast (no GOT games, other performance seppuku) builds. These are zero point zero waste: fully artifact-cached at the library level, fully action-cached at the source level, fully composable, supporting cross-compilation and any standard compiler.

I do this in real life. It's a few hundred lines of nix and bash. I'm able to do this because I worked on Buck and shit, and I've dealt with Bazel and CMake for years, and so I know that the stuff is broken by design, there is no good reason and no plan beyond selling consulting.

This complexity theatre sells shit. It sure as hell doesn't stop security problems or Docker brainrot or keep cloud bills down.

ethin 1 days ago [-]
The only exception to this general rule (which, to be clear, I agree with) is when your code for whatever reason links to LGPL-licensed code. A project I'm a major contributor to does this (we have no choice but to use these libraries, due to the requirements we have, though we do it via implib.so (well, okay, the plan is to do that)), and so dynamic linking/DLL hell is the only path we are able to take. If we link statically to the libraries, the LGPL pretty much becomes the GPL.
benreesman 1 days ago [-]
Sure, there are use cases. Extensions to e.g. Python are a perfectly reasonable usecase for `dlopen` (hooking DNS on all modern Linux is...probably not for our benefit).

There are use cases for dynamic linking. It's just user-hostile as a mandatory default for a bunch of boring and banal reasons: KitWare doesn't want `pkg-config` to work because who would use CMake if they had straightforward alternatives. The Docker Industrial complex has no reason to exist in a world where Linus has been holding the line of ABI compatibility for 30 years.

Dynamic linking is fine as an option, I think it's very reasonable to ship a `.so` alongside `.a` and other artifacts.

Forcing it on everyone by keeping `pkg-config` and `musl` broken is a more costly own goal for computing than Tony Hoare's famous billion-dollar mistake.

throwawayffffas 1 days ago [-]
Couldn't agree more with you. The whole reason Docker exists is to avoid having to deal with dynamic libraries: we package the whole userland and ship it just to avoid dealing with different dynamic link libraries across systems.
benreesman 1 days ago [-]
Right, the popularity of Docker is proof of what users want.

The implementation of Docker is proof of how much money you're expected to pay Bezos to run anything in 2025.

KWxIUElW8Xt0tD9 1 days ago [-]
Yes, DLL hell is the issue with dynamic linking -- how many versions of given libraries are required for the various apps you want to install? -- and then you want to upgrade something and it requires yet another version of some library -- there is really no perfect solution to all this.
benreesman 1 days ago [-]
You reconcile a library set for your application. That's happening whether you realize it or not, and whether you want to or not.

The question is, do you want it to happen under your control in an organized way that produces fast, secure, portable artifacts, or do you want it to happen in some random way controlled by other people at some later date that will probably break or be insecure or both.

There's an analogy here to systems like `pip` and systems with solvers in them like `uv`: yeah, sometimes you can call `pip` repeatedly and get something that runs in that directory on that day. And neat, if you only have to run it once, fine.

But if you ship that, you're externalizing the costs to someone else, which is a dick move. `uv` tells you on the spot that there's no solution, and so you have to bump a version bound to get a "works here and everywhere and pretty much forever" guarantee that's respectful of other people.

uecker 1 days ago [-]
Isn't this what partial linking is for, combining object files into a larger one?
SanjayMehta 17 hours ago [-]
Unix originated on the PDP-11, a machine with very limited memory and disk space. At that time, this was not only the right solution, it was probably the only solution.

Calling it “a bad idea all along” is undeserved.

high_na_euv 1 days ago [-]
.so .o .a .pc holy shit, what a mess

Why are things that are solved in other programming ecosystems impossible in the C/C++ world, like a sane build system?

adev_ 1 days ago [-]
> Why are things that are solved in other programming ecosystems impossible in the C/C++ world, like a sane build system?

This is such an ignorant comment.

Most other natively compiled languages have exactly the same concepts behind them: object files, shared libraries, collections of objects, and some kind of configuration describing the compilation pipeline.

Even high-level languages like Rust have that (to some extent).

The fact it is buried and hidden under 10 layers of abstraction and fancy tooling for your language does not mean it does not exist. Most languages currently do rely on the LLVM infrastructure (C++) for the linker and their object model anyway.

The fact you (probably) never had to manipulate it directly just means your higher-level, superficial work never brought you deep enough for it to become a problem.

high_na_euv 1 days ago [-]
> The fact you (probably) never had to manipulate it directly just means your higher-level work never brought you deep enough for it to become a problem.

Did you just agree with me that other prog. ecosystems solved the building system challenge?

trinix912 1 days ago [-]
They solved it by building on top of what C (or LLVM, to be precise) does, not by avoiding or replacing it.

What should C do to solve it? Add another layer of abstraction on top of it? CMake does that and people complain about the extra complexity.

high_na_euv 1 days ago [-]
Bro, CMake is garbage. Having to debug CMake scripts is a pain.

Also, how does e.g. dotnet build on top of C or LLVM?

adev_ 1 days ago [-]
> Did you just agree with me that other prog. ecosystems solved the building system challenge?

Putting the crap in a box with a user-friendly handle on it to make it look 'friendlier' is never 'solving a problem'.

It is merely hiding the dust under the carpet.

uecker 1 days ago [-]
I think people today often no longer understand that C, like many other things in the UNIX world, is a tool, not a complete framework. But somehow people expect a complete, convenient framework with batteries included. They see it as a deficiency that C by itself does not provide many things. I see it as one of its major strengths, and this is one of the reasons why I prefer it.
tester756 1 days ago [-]
A sane build system is a pretty basic thing in a modern language.

You'd expect something as mature as C to have such a thing by default

uecker 16 hours ago [-]
Thanks for confirming my point ;-)
sparkie 1 days ago [-]
Because those other ecosystems assume that someone has already done the work on the base system and libraries, so that they don't have to worry about them and can focus purely on their own little islands.
pjmlp 1 days ago [-]
More like, in other ecosystems, especially the compiled languages that weren't born as part of UNIX like C and C++, the whole infrastructure treats building and linking as part of the language.

Note that ISO C and ISO C++ ignore the existence of compilers, linkers and build tools; per the legalese, there is some magic way the code gets turned into machine code. The standards don't even consider the existence of filesystems, or of header file and translation unit locations; those are talked about in the abstract, and could in a fully standard-compliant way be stored in a SQL database.
