What we've run into:
1. The admin/owner is nearly, if not absolutely, unreachable, which causes a variety of issues. Mainly, we cannot even request a traditional backup of the database underneath the forum software.
2. As with anything, the forums don't get much active engagement from anyone other than older forum regulars. However, Google searches easily find useful posts for things like DIY maintenance, modification installs, test data from driving with ECU tunes, track day experiences, etc.
3. It's easy to point people on other social networks at posts by their URL, but due to neglect the website constantly has problems, making access increasingly complicated and inconsistent.
Ideally, it'd be nice to find a way to scrape everything as closely as possible into a manageable database.
Even more ideally, if we could convert said scraped data into a format that is easily publishable to a new platform, that would be handy. Even if the new platform is static and simply renders the old threads.
I can't imagine we are the only forum experiencing problems like this, with most forums probably dying in the last decade.
Has anyone gone through this kind of archival process with vBulletin before?
Thanks.
Assuming you're a member of the organization and therefore licensed to use the content (but merely unable to access it): Purely hypothetically speaking, if an admin is this MIA and obviously not on top of the job, the odds are probably high that they've neglected maintenance. Old PHP server running out-of-date PHP applications... not the most secure combo in the world. I wouldn't be surprised if there were some magic strings you could send to the server to get it to regurgitate the contents of the database in a more developer-friendly, strongly-typed fashion which you could import to MyBB or XenForo and continue chugging along.
wget --mirror --convert-links --adjust-extension --page-requisites --no-parent --execute robots=off --wait=0.2 --domains example.com https://example.com
1. Scrape it with wget or httrack or a similar tool.
2. If the owner's not really around, it’s probably behind on its security patches, and there are some relatively recent-ish vB exploits that would let you gain code execution and take a backup of the entire database, site, etc. the “extremely illegal way”.
I recommend 1, but 2 is amusing to ponder briefly over a coffee ;)
This will fix relative paths, download assets, etc., and the result can be published as-is on a new site. I'm ignoring copyright questions in the interest of archiving fragile data.
Then I'd use an HTML parser against the local archive to extract the individual posts, if the additional work was justified.
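For what it's worth, here's a rough sketch of that parsing step using only Python's standard library. It assumes each post body sits in an element whose id starts with "post_message_", which is what stock vBulletin 3.x/4.x themes tend to emit, but that is an assumption you should check against your own mirror. The names PostExtractor and extract_posts are made up for illustration.

```python
from html.parser import HTMLParser

# Void elements never get a closing tag, so they must not change nesting depth.
VOID_TAGS = {"br", "img", "hr", "input", "meta", "link", "area", "base",
             "col", "embed", "source", "track", "wbr"}


class PostExtractor(HTMLParser):
    """Collect the text of every element whose id starts with POST_ID_PREFIX."""

    POST_ID_PREFIX = "post_message_"  # assumption: verify against your theme

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.posts = []      # list of (post_id, text) tuples
        self._depth = 0      # nesting depth inside the current post; 0 = outside
        self._post_id = None
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if self._depth:
            if tag == "br":
                self._chunks.append("\n")  # keep line breaks as newlines
            elif tag not in VOID_TAGS:
                self._depth += 1
            return
        elem_id = dict(attrs).get("id") or ""
        if elem_id.startswith(self.POST_ID_PREFIX):
            self._post_id = elem_id[len(self.POST_ID_PREFIX):]
            self._depth = 1
            self._chunks = []

    def handle_endtag(self, tag):
        if not self._depth or tag in VOID_TAGS:
            return
        self._depth -= 1
        if self._depth == 0:  # closed the post container itself
            self.posts.append((self._post_id, "".join(self._chunks).strip()))

    def handle_data(self, data):
        if self._depth:
            self._chunks.append(data)


def extract_posts(html):
    """Return [(post_id, text), ...] for one mirrored page."""
    parser = PostExtractor()
    parser.feed(html)
    parser.close()
    return parser.posts
```

You'd call extract_posts on the text of each .html file in the mirror directory. The depth counting trusts the markup to close its tags properly, so sloppy themes may need a more forgiving parser, and old boards often aren't UTF-8, so watch the encoding when reading files.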
It's single-threaded, alpha-quality software, and still isn't compatible with many forums and themes. But it can export WARCs and may just happen to work for you.
https://github.com/lloydpick/vbulletin
This is a very old tool, so it’s hard to say if it will work; then again, it seems very relevant, so worst case it could provide some inspiration.
I ran a car forum (sold to VerticalScope 15+ years ago) and it's still chugging along on the same version of PunBB that I had it on when I left, so it seems that even the "experts" haven't found a simple way to migrate between forum software.
The process is in principle not difficult: scrape the site (I recommend a dedicated scraper for that), then go through and extract everything relevant into a SQL database formatted the way your target forum software expects. The hardest part was recovering BBCode formatting in a usable fashion. Unfortunately my converters were written back when I didn't understand HTML parsing terribly well, so they're a hodgepodge of ugly regexes and handrolled string parsing.
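To make the BBCode part concrete, here's a minimal sketch (not the converters mentioned above) that uses a real HTML parser instead of regexes. It only handles a handful of inline tags, and the table layout in store_posts is hypothetical; your target forum software will expect its own schema.

```python
import sqlite3
from html.parser import HTMLParser

# HTML tag -> BBCode tag for simple inline formatting (illustrative subset only).
SIMPLE_TAGS = {"b": "b", "strong": "b", "i": "i", "em": "i", "u": "u"}


class HtmlToBBCode(HTMLParser):
    """Translate a small subset of rendered post HTML back into BBCode."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self._link_stack = []  # True for each open <a> that emitted [url=...]

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in SIMPLE_TAGS:
            self.out.append(f"[{SIMPLE_TAGS[tag]}]")
        elif tag == "a":
            href = attrs.get("href")
            self._link_stack.append(bool(href))
            if href:
                self.out.append(f"[url={href}]")
        elif tag == "img" and attrs.get("src"):
            self.out.append(f"[img]{attrs['src']}[/img]")
        elif tag == "br":
            self.out.append("\n")

    def handle_endtag(self, tag):
        if tag in SIMPLE_TAGS:
            self.out.append(f"[/{SIMPLE_TAGS[tag]}]")
        elif tag == "a" and self._link_stack and self._link_stack.pop():
            self.out.append("[/url]")

    def handle_data(self, data):
        self.out.append(data)


def html_to_bbcode(fragment):
    parser = HtmlToBBCode()
    parser.feed(fragment)
    parser.close()
    return "".join(parser.out)


def store_posts(conn, rows):
    """rows: iterable of (post_id, thread_id, author, html_body).

    The 'posts' table here is a made-up staging schema, not any forum's real one.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS posts ("
        "post_id INTEGER PRIMARY KEY, thread_id INTEGER, "
        "author TEXT, body_bbcode TEXT)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO posts VALUES (?, ?, ?, ?)",
        [(pid, tid, author, html_to_bbcode(body)) for pid, tid, author, body in rows],
    )
    conn.commit()
```

Quotes, code blocks, smilies, and attachments are where it gets painful, since each theme renders them differently; those would need per-forum rules layered on top of something like this.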
I hosted it again by writing a Python script[2] to serve responses from that WARC file and put it behind nginx with caching enabled.
[1] https://forums.empiresmod.com/index.php
[2] https://gitlab.com/thexa4/warc-server
[2, deb package] https://gitlab.com/thexa4/warc-server/-/jobs/5213679726/arti...
Though this suggestion might not be acceptable in the eyes of many.
If the hosting company is paid, they will make and keep a backup for you, but under the permissions/access of the original owner.
If the hosting company gets permission from the site owner to add you as an admin (the owner may not be in touch with you, but may respond to the hosting company), then you are home free; since you are paying the hosting company, they will be happy to keep you around.
1. Scrape it. Plenty of options here.
2. Set up a new forum.
I think the current state of the art is Discourse but I could be wrong.
3. Automate the recreation of posts and threads with some backend script
(this will depend on which software you picked).
On each post, add a link to the original.
4. Tell everyone about your superduper clone, move the old-timers over.
5. ...
6. Profit.
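A rough sketch of step 3, including the link back to the original. Everything here is hypothetical: the ArchivedPost fields, the payload keys, and the old URL pattern (showthread.php?p=ID is a common vBulletin permalink shape, but check yours). How you actually submit the payload depends entirely on which software you picked.

```python
from dataclasses import dataclass


@dataclass
class ArchivedPost:
    """One scraped post; field names are illustrative, not from any real schema."""
    post_id: int
    thread_title: str
    author: str
    body: str


# Assumption: the old forum's permalink pattern. Adjust to match your site.
OLD_POST_URL = "https://example.com/showthread.php?p={post_id}#post{post_id}"


def build_import_payload(post):
    """Build a generic dict for a new-forum import script to submit.

    The keys 'title' and 'body' are placeholders, not any forum's actual API.
    """
    original = OLD_POST_URL.format(post_id=post.post_id)
    footer = f"\n\n(Originally posted by {post.author}: {original})"
    return {"title": post.thread_title, "body": post.body + footer}
```

Crediting the original author in the footer matters because imported posts usually end up owned by whatever account runs the import.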
It sounds like you have some options here. Best of luck.
wget --mirror