Look through closed tickets in the issue tracker, and try to find a change (bugfix or new feature) that must have, given its nature, touched the functionality you're looking for. Then try to find the changeset(s) where that ticket's change was implemented.
With some luck, the changeset will include a modification to the part of the codebase you've been looking for.
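A rough sketch of that ticket-to-changeset search, assuming the project lives in git and commit messages mention the ticket somehow (the `Bug: 12345` reference in the note below is a made-up example, not a real ticket). It counts which files the matching commits touched, so the most frequently changed paths float to the top:

```python
import subprocess
from collections import Counter


def files_touched(log_output: str) -> Counter:
    """Count how often each path appears in `git log --name-only --format=` output."""
    return Counter(line.strip() for line in log_output.splitlines() if line.strip())


def commits_for_ticket(repo: str, ticket: str) -> str:
    """Return the changed-file lists of all commits whose message mentions the ticket."""
    # --grep filters on the commit message; the empty --format= drops the commit
    # header so that only the file names printed by --name-only remain.
    cmd = ["git", "-C", repo, "log", f"--grep={ticket}", "--name-only", "--format="]
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
```

Calling `files_touched(commits_for_ticket(".", "Bug: 12345")).most_common(10)` inside a checkout would then list the ten paths that those commits changed most often.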
Stuff like that is certainly possible to find. But it requires a lot of time and dedication.
I would personally search for job listings for Chromium / Firefox, then use that knowledge to find someone who works on it. Then I’d ask them where it is.
But only in this specific case. My normal workflow is to build whatever it is I’m looking at, then change things until it breaks. It’s pretty quick to narrow down what I’m looking for at that point.
That doesn’t work here because building Chrome requires close to a supercomputer.
EDIT: Actually, I would try to find a crash log related to the DOM. The stack trace will point you to precisely the code you’re interested in. That is easier said than done, but I’ve pulled that trick a couple of times, so it seems worth mentioning.
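Once you have a trace, picking out the interesting frames is mechanical. Here is a minimal sketch, assuming gdb-style backtrace lines of the form `#N 0xADDR in function (args) at file:line`. The sample trace and every name in it are invented for illustration; it is not a real Chromium crash.

```python
import re

# Matches gdb-style frames such as:
#   #1  0x00005556 in ns::Func (arg=1) at path/to/file.cc:42
# The address part is optional because gdb often omits it for frame #0.
FRAME = re.compile(r"^#(\d+)\s+(?:0x[0-9a-f]+ in )?(\S+) .*? at (\S+):(\d+)$")


def frames_matching(trace: str, keyword: str):
    """Yield (frame_no, function, file, line) for frames mentioning keyword."""
    for raw in trace.splitlines():
        m = FRAME.match(raw.strip())
        if m and (keyword in m.group(2).lower() or keyword in m.group(3).lower()):
            yield int(m.group(1)), m.group(2), m.group(3), int(m.group(4))


# Hypothetical trace, made up for this example.
SAMPLE = """\
#0  0x00005555 in app::Abort () at src/util/abort.cc:10
#1  0x00005556 in app::dom::TreeBuilder::Insert (this=0x1) at src/dom/tree_builder.cc:120
#2  0x00005557 in app::Parser::Run (this=0x2) at src/parser/parser.cc:88
"""

for frame in frames_matching(SAMPLE, "dom"):
    print(frame)
```

Other crash reporters format frames differently, so the regular expression would need adjusting to whatever log you actually find.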
For example:
* Run Chromium under strace, ltrace, or gdb to see what's going on at runtime.
* Do some experiments / reverse engineering, treating the application / source code as a black box. Try different HTML input, inspect the DOM in Chrome, possibly automate this process via Selenium or something similar, and discover the runtime behavior of the algorithm that way.
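The black-box experiments in the second bullet can be automated without Selenium. This is a sketch under a few assumptions: that a binary called `chromium` is on your PATH (the name varies by platform), and that headless mode with `--dump-dom` accepts a `data:` URL on your version. If it does not, writing the snippet to a temporary file and passing a `file://` URL is a fallback.

```python
import subprocess
from urllib.parse import quote


def data_url(html: str) -> str:
    """Wrap an HTML snippet in a data: URL so no web server is needed."""
    return "data:text/html;charset=utf-8," + quote(html)


def dump_dom_command(html: str, binary: str = "chromium") -> list[str]:
    """Build the headless command line that prints the parsed DOM to stdout."""
    return [binary, "--headless", "--dump-dom", data_url(html)]


def dump_dom(html: str, binary: str = "chromium") -> str:
    """Run the browser and return the serialized DOM (needs a local install)."""
    result = subprocess.run(dump_dom_command(html, binary),
                            capture_output=True, text=True, check=True)
    return result.stdout


# Deliberately malformed inputs, chosen to probe how the tree builder recovers.
EXPERIMENTS = ["<p>one<p>two", "<table><b>x</table>", "<b><i>x</b>y</i>"]
```

Looping over `EXPERIMENTS` with `dump_dom` and comparing each input against the serialized output shows where the parser inserted, moved, or closed elements, which narrows down the behavior worth looking for in the source.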
The thing to keep in mind is that, for all you know, the DOM building algorithm is split across thousands of source files, or is in fact in some dependency and not in chromium itself, or is split across both. Presumably there is some particular aspect of the DOM building that you are interested in, so experiment with how that works, instead of trying to find / understand the entirety of chromium DOM building.
Breakpoint methodology wins for me, simple and true.
I imagine it like pathfinding the Minotaur's maze: you stand at the last place you recognise and can get back to (if that's literally the first active line, that's fine), and put something there (breakpoint, print statement, log line), run it, and check you still know where you are. Then put another down as far forward from that point as you can 'see'; if that's literally one operation step, fine, spin it and check. Breakpoints are easily put down and just as easily cleared back up again. Keep only as many as you need to see which branch you are on.
Pretty soon you'll have run the damn thing so many times you'll know its bootstrapping and foibles, and they will be second nature. You'll start seeing how it's generally laid out, and you'll know where the main start-up branchings are. When they leap into async or hidden 'rooms', log lines are perfect.
When the engine of it starts moving in your head, then is the time to start throwing breakpoints, prints or log lines in places that originally were completely unknown but now you have a feeling for. It's at this point you'll be bloody close to where you want to be.
Oh, and do future you a favour: at least jot down something as you're going through this. I find that this initial torchlighting is remarkably gratifying, but if you don't make notes, in six months' time it'll be completely gone, and you'll have to do between a quarter and a half of this all over again before the lights start coming on and you remember how it's laid out.
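To make the marker-dropping idea concrete, here is a toy Python sketch. In a C++ codebase like Chromium you would use debugger breakpoints or log lines instead; `sys.settrace` just plays the role of "a marker at every function entry" here. The functions `parse`, `build` and `empty` are made up for the example.

```python
import sys


def trace_calls(func, *args, only=None):
    """Run func and return (result, names of Python functions entered, in order)."""
    seen = []

    def tracer(frame, event, arg):
        if event == "call":
            name = frame.f_code.co_name
            if only is None or name in only:
                seen.append(name)
        return None  # no per-line tracing inside the frame

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        result = func(*args)
    finally:
        sys.settrace(previous)  # always clear the markers back up
    return result, seen


# Toy code under investigation: which branch does a given input take?
def parse(text):
    return build(text) if text else empty()


def build(text):
    return text.upper()


def empty():
    return ""
```

Running `trace_calls(parse, "x")` and `trace_calls(parse, "")` shows the two different paths through the code, and the `only` argument keeps the list down to the few markers you actually care about.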
My way is to have a project. What that is is unimportant, but it needs to be big enough that you run into roadblocks.
Now you know what your weak point is, and what you need to learn to overcome that weak point. So now you also know what it is you need to search for in that code-base.
function C($a, $b, $ba, $bb, $c, $d = NULL) { /* insert random garbage with eval statement.. */ }
You can use filters to narrow down the results to the right languages and paths.
A good source structure tool will save some time but you’re not getting away without doing your own reading anyway.
But to not leave a one-word answer: start by searching for a feature you know about and look around from where you find it. It might help to add a super small feature yourself - when it finally works, you’ll have some idea of how the code is structured and will be able to infer where other features would live if they were there. (That might take a bit more than one addition ;))
In chromium’s case, what you’re trying to do is more like reverse engineering… I’d start with a debugger.