While mapping vector embeddings of arXiv abstracts, I came across some surprising clusters that I hadn’t seen before. This post is an attempt to dig deeper into what they might represent.
FYI, the link to the previous blog post is broken. You may also want to add a link to the blog's landing page somewhere near the top.
On the content itself, have you tried to find pairs of abstracts from the two clusters that differ as little as possible along the other dimensions, to see what's different about them?
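A minimal sketch of that pairing idea, assuming each abstract already has map coordinates and a cluster label. The function name, the `split_dim` parameter, and the brute-force search are illustrative assumptions, not something taken from the post:

```python
import heapq
import math


def closest_cross_cluster_pairs(coords, labels, a, b, split_dim, k=5):
    """Return up to k (i, j, distance) tuples with i in cluster a and j in
    cluster b, ranked by Euclidean distance over every dimension except
    split_dim (the axis assumed to separate the two clusters)."""

    def others(vec):
        # Drop the separating dimension so only the remaining ones are compared.
        return [x for d, x in enumerate(vec) if d != split_dim]

    idx_a = [i for i, lab in enumerate(labels) if lab == a]
    idx_b = [j for j, lab in enumerate(labels) if lab == b]

    # Brute force over all cross-cluster pairs: fine for a sample,
    # too slow for the full arXiv corpus.
    candidates = (
        (math.dist(others(coords[i]), others(coords[j])), i, j)
        for i in idx_a
        for j in idx_b
    )
    return [(i, j, d) for d, i, j in heapq.nsmallest(k, candidates)]


if __name__ == "__main__":
    # Toy data: dimension 0 separates the clusters, dimensions 1 and 2 are "the rest".
    coords = [[0, 0, 0], [0, 5, 5], [10, 0.1, 0], [10, 9, 9]]
    labels = [0, 0, 1, 1]
    for i, j, d in closest_cross_cluster_pairs(coords, labels, 0, 1, split_dim=0, k=2):
        print(i, j, round(d, 3))
```

Reading the abstracts behind the top few pairs side by side would show what the separating dimension encodes once everything else is held roughly constant.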
Hello, thank you for pointing out the errors. I have fixed them all.
I have also added the plots for the y and z dimensions to the blog post itself.
I love the 3D map where there is a big lobe for physics, and another for mostly everything else.
Yes! However, CS is taking over: it accounted for nearly half of arXiv papers in 2024 and 2025 :)
True, but it doesn't have a separate continent.
excellent post -- I turned it into a video :) https://supabase.manatee.work/storage/v1/object/public/video...
Okay, this is mind blowing!