ArXiv Declares Independence from Cornell (science.org)
frankling_ 25 minutes ago [-]
The recent announcement to reject review articles and position papers already smelled like a shift towards a more "opinionated" stance, and this move smells worse.

The vacuum that arXiv originally filled was one of a glorified PDF hosting service with just enough of a reputation to allow some preprints to be cited in a formally published paper, and with just enough moderation to not devolve into spam and chaos. It has also been instrumental in pushing publishers towards open access (i.e., to finally give up).

Unfortunately, over the years, arXiv has become something like a "venue" in its own right, particularly in ML, with some decently cited papers never formally published and "preprints" being cited left and right. Consider the impression you get when seeing a reference to an arXiv preprint vs. a link to an author's institutional website.

In my view, arXiv fulfills its function better the less power it has as an institution, and I thus have exactly zero trust that the split from Cornell is driven by that function. We've seen the kind of appeasement prose from their statement and FAQ [1] countless times before, and it's now time for the usual routine of snapshotting the site to watch the inevitable amendments to the mission statement.

"What positive changes should users expect to see?" - I guess the negative ones we'll have to see for ourselves.

[1] https://tech.cornell.edu/arxiv/

halperter 2 hours ago [-]
reed1234 2 hours ago [-]
Should be the main link. The original article is based on the CEO job posting.
asimpleusecase 5 minutes ago [-]
I wonder if there are plans to licence the content for AI training
psalminen 2 hours ago [-]
I might be missing something, but I still don't get the why. I don't see any "problem" that needs to be solved.
kolinko 58 minutes ago [-]
The article lists the reasons quite clearly.
binsquare 21 minutes ago [-]
For everyone else,

The reason is that arXiv is growing significantly, leading to a deficit of 297,000 in operating costs for 2025 alone. Cornell has helped with donations, along with other organizations that pay membership fees.

As a result, donors + leaders of arxiv think it's best to spin off to increase funding.

u1hcw9nx 46 minutes ago [-]
I think the problem described in 6th paragraph needs to be solved.
Garlef 27 minutes ago [-]
Maybe they should implement a graph based trust system:

You need your favourite academic gatekeeper (= thesis advisor) to vouch for you in order to be allowed to upload.

Then AI slop gets flagged and the shame spreads through the graph. And flaggings need to have evidence attached that can again be flagged.
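
A minimal sketch of what such a vouching graph might look like, purely as an illustration. The class name `TrustGraph`, the `decay` factor, and the idea of halving the penalty at each step up the vouching chain are all assumptions made for this example, not anything arXiv has proposed. Flagging the flags themselves is left out for brevity.

```python
from collections import defaultdict


class TrustGraph:
    """Hypothetical vouching tree: every author except a root has one voucher."""

    def __init__(self, decay=0.5):
        self.voucher_of = {}                 # author -> voucher (None for roots)
        self.penalty = defaultdict(float)    # accumulated "shame" per author
        self.evidence = defaultdict(list)    # evidence attached to each flagged author
        self.decay = decay                   # assumed attenuation per hop up the chain

    def add_root(self, name):
        # Roots are the pre-trusted gatekeepers (e.g. established advisors).
        self.voucher_of[name] = None

    def vouch(self, voucher, newcomer):
        if voucher not in self.voucher_of:
            raise ValueError(f"{voucher} is not trusted and cannot vouch")
        if newcomer in self.voucher_of:
            raise ValueError(f"{newcomer} already has a voucher")
        self.voucher_of[newcomer] = voucher

    def can_upload(self, name):
        # Only authors reachable from a root through vouches may upload.
        return name in self.voucher_of

    def flag(self, author, evidence):
        if not evidence:
            raise ValueError("a flag must carry evidence")
        if author not in self.voucher_of:
            raise ValueError(f"{author} is unknown")
        self.evidence[author].append(evidence)
        # Walk up the vouching chain, applying a smaller penalty at each hop.
        weight, node = 1.0, author
        while node is not None:
            self.penalty[node] += weight
            weight *= self.decay
            node = self.voucher_of[node]
```

With `decay=0.5`, a flag on an author costs them 1.0, their voucher 0.5, and the voucher's voucher 0.25, so vouching carelessly has a cost that fades with distance.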

dmos62 17 minutes ago [-]
I've often thought that similar trust systems would work well in social media, web search, etc., but I've never seen it implemented in a meaningful way. I wonder what I'm missing.
IshKebab 13 minutes ago [-]
Lobsters has this I think. But it also means I've never posted there.
Peteragain 50 minutes ago [-]
.. and soon to be dependent on US military funding? Controlled by someone who has run-ins with universities? This'll end in tears.
dataflow 3 hours ago [-]
This sounds terrible. Of course there's a huge risk of it becoming for-profit. It almost makes you wonder if the academic publishers are behind this push somehow.

Could they not have made it into some legal structure that puts universities at the top? Say, with a bunch of universities owning shares that comprise the entirety of the ownership of arXiv, but that would allow arXiv to independently raise funds?

gucci-on-fleek 2 hours ago [-]
> Of course there's a huge risk of it becoming made for-profit.

The article says that "it will become an independent nonprofit corporation", and as OpenAI's failed attempt showed, converting a non-profit to a for-profit organization is either really hard or impossible.

> Could they not have made it into some legal structure that puts universities at the top?

As a corporation (even a non-profit one), it will have a board of directors. I have no idea what their charter will look like, but I would be surprised if at least one seat wasn't reserved for a university representative, and more than that seems quite likely as well.

MostlyStable 2 hours ago [-]
OpenAI didn't get everything that they wanted, but I very much disagree with calling it a "failed attempt". The non-profit went from owning the entirety of OpenAI to having ~25% stake.
ronsor 2 hours ago [-]
Sam Altman is a special kind of person; not many could pull off the schemes he does.
gentleman11 2 hours ago [-]
I doubt it was him who architected it. A team of lawful evil lawyers more likely
gucci-on-fleek 2 hours ago [-]
Ah, thanks for the correction.
adamnemecek 3 hours ago [-]
Good call, ArXiv seems like one of the most important institutions out there right now.
koakuma-chan 57 minutes ago [-]
it just hosts pdfs, no?
freehorse 32 minutes ago [-]
Well, technically, it can also compile your tex file if you upload the tex file instead of the pdf directly, which helps a lot in standardizing the stylistic structure between preprints. Most other repositories are wild west and inconsistent. I really appreciate the similarity in style applied to most preprints there. Moreover, this means you can also download not just the pdf, but the source tex file too, which can be very useful.
aragilar 37 minutes ago [-]
It does do a fair amount of filtering of submissions, and it's a long term archive (e.g. for the next 100+ years). I suspect both (but with the former dominating) are the issue.
pfortuny 37 minutes ago [-]
It also hosts the sources and has a very tame but useful pre-acceptance process.
p-e-w 2 hours ago [-]
It’s so important, in fact, that there should be more than one such institution.

People keep falling into the same trap. They love monopolies, then are shocked when those monopolies jerk them around.

freehorse 36 minutes ago [-]
It is just a preprint repository. It is pretty open (the stories where a preprint was rejected or delayed unreasonably are extremely rare). It offers the basic services for a math/compsci/physics themed preprint repository.

I don't see much of a monopoly, nor any "moat" apart from it being recognised. You can already post preprints on a personal website or on github, and there are "alternatives" such as researchgate that can also host preprints, or zenodo. There are even some lesser known alternatives. I do not see anything special in hosting preprints online apart from the convenience of having a centralised place to put them and search for them (which you call "monopoly"). If anything, the recognisability and centrality of arxiv helped a lot in the old, darker days to establish open access to papers. There was a time when many journals would not let you publish a preprint, or had all kinds of weird rules about when you can and when you can't. Probably still the case to some degree.

auggierose 2 hours ago [-]
I have been using Zenodo instead for a while now. It is more user-friendly as well.
mastermage 49 minutes ago [-]
Zenodo is more for IT Papers and also datasets isn't it?
auggierose 34 minutes ago [-]
It can host large datasets as well, yes. It is hosted by CERN, so it is not specifically IT in any way. It also allows you to restrict access to the files of your submission. It has no requirements to submit your LaTeX sources, any PDF will be fine. There are also no restrictions on who can publish. You'll get a DOI, of course.

Everything published on arXiv could also be published on Zenodo, but not the other way around.

andbberger 2 hours ago [-]
there is. bioRxiv.
tornikeo 2 hours ago [-]
Now the question is, will arxiv wage a decade long bloody war with Cornell, using heavy infantry (PhD students), archers (reviewers) and field artillery (AI slop papers), or will the independence be mostly peaceful? Only time can tell.
alansaber 2 hours ago [-]
PhD students are levy infantry at best with Postdocs being the armoured levies.
dmos62 15 minutes ago [-]
Is this Gondor or Mordor?
OutOfHere 56 minutes ago [-]
With 300K for the CEO, its enshittification will commence imminently. It will now serve to maximize revenue. Just wait and watch while they issue a premium membership, payment requirements for authors, and other revenue generators to please their investors.
exe34 51 minutes ago [-]
they'll just turn into a shitty journal at this point, they just need to introduce peer review and they can start competing with the real journals on price point.

another will need to rise to take its place.

OutOfHere 43 minutes ago [-]
> they'll just turn into a shitty journal at this point

To this end, they added an endorsement requirement this year: https://blog.arxiv.org/2026/01/21/attention-authors-updated-...

davnicwil 2 hours ago [-]
Very unrelated to the article, but I think 'arXiv' as a brand is bad, and really detrimental to what the institution aims to accomplish.

That is, it's not readily parseable, it really gives an insider term vibe - like this isn't for you if you don't already know what it means or how you should read or say it. It sort of reminds me of the overuse of latin and latinate terms generally in the old professions and, well, the academy.

Just always struck me as being somewhat at odds with the goal.

john-titor 1 hours ago [-]
I wonder what makes you feel that. I've been publishing preprints on arxiv for close to a decade now and never had any particular feelings about it.

To me it's just a way to get out your work fast, so that there is already a trace of it on the Internets - nothing more and nothing less.

> That is, it's not readily parseable, it really gives an insider term vibe...

Isn't that normal with highly specialized research fields? I agree many papers could benefit from clearer wording, but working in a niche means you sometimes don't reach a broader audience

davnicwil 1 hours ago [-]
It's an opinion, and you feeling no particular way about it is equally valid.

But I did justify it, and maybe to reword slightly: surely if one of the main drivers is opening up research, the brand name should be something that's less obscure and more accessible / understandable as to what it is on first sight?

Maybe arXiv evoking the word 'archive' with an ancient Greek twist does that for some, but it's clearly a bit cryptic for many, and if the point is to open up probably the brand should just be something much plainer.

aragilar 39 minutes ago [-]
No, it's to be a pre-print server. If someone doesn't know what that means, then they shouldn't be using arXiv.
davnicwil 15 minutes ago [-]
everyone has a first time they see a thing and don't yet know what it is.

Using a brand as a filter where you have to already know what it means to get it is exactly the opposite of what it's supposed to achieve.

Consider the most exclusive (successful) brands that exist. Even there, where exclusivity is a brand goal, none of them have this property of being obscure on first contact.

nixon_why69 1 hours ago [-]
> like this isn't for you if you don't already know what it means

Isn't that actually kind of a good brand signal for a repo of very specialized papers? "Fun with learning" in comic sans wouldn't help credibility.

jltsiren 44 minutes ago [-]
It's a classic story of someone having to pick a name quickly, which then gets established long before anyone who cares about branding is aware of its existence.

The original service didn't even have a name, only a description, and it was amusingly hosted at xxx.lanl.gov. But LANL wasn't really interested in it, and the founder eventually left for Cornell. At that point, the service needed a domain name, but archive.org was already taken.

And besides, the name has Ancient Greek influences. A similar Latinate term might be something like "archive".

davnicwil 8 minutes ago [-]
Interesting, thanks for the context! Makes it more understandable as a choice.
vasco 1 hours ago [-]
This is the type of guy that will suggest paper.ly as a better name with a straight face and then we wonder why the internet is turning to shit