Hey All,
I would like some feedback on the Archive category. But first…
What is the Archive category? It is a new category that will one day contain the ENTIRE old www.chaos.dwarf.com forum content. I have been writing a piece of software that does A LOT to make sure the data stays as consistent as possible.
The tool does the following:
- It takes a batch (the size is configurable) of ALL the posts ever made on the old forum (about 0.25 million in total) and loops through them chronologically.
- It checks whether the post’s category already exists on Discourse; if not, it is created.
- It checks whether there is a topic for this post; if not, it creates the topic (the post is then technically the topic itself).
- If there is a topic, it creates a reply in that topic.
→ The message is sanitized, meaning the MyBB markup tags that do not work on Discourse are removed/replaced.
→ The message is scanned for ALL links that refer either to the old forum (to another topic or something) OR to images that were hosted on the forum itself. These links are stored in a database for later reprocessing, since some of the posts they point to do not exist on Discourse yet at this point.
→ The poster’s username from the old forum is prefixed to the message for continuity’s sake. Usernames are matched by a high-probability algorithm, so “Pyro Stick” and “Pyrostick” can be matched. This MIGHT not always be 100% correct.
→ Posts by users that do not exist on this forum are made by the “System” user.
- The creation of the post on Discourse is logged in a database so I can link the old post ID to the new Discourse post at any time AND so I always know which post was processed last.
- Whenever I feel like it (or maybe automatically after each bulk transfer), I’ll take those URLs that I saved in the database and look up the Discourse version of each page. If it’s found, the post is updated and saved, and the record is marked as complete.
→ This will be extended to also upload the missing images that are physically on my PC, but only after storage on Discourse has been taken into account.
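To give an idea of how the username matching step can behave, here is a minimal sketch in Python. It is NOT the actual tool's code: the function names, the 0.85 cutoff and the "system" fallback name are assumptions made for illustration, and it uses the standard library's `difflib` as a stand-in for whatever matching the tool really does.

```python
import difflib
import re


def normalize(name: str) -> str:
    """Lowercase the name and drop everything except letters and digits."""
    return re.sub(r"[^a-z0-9]", "", name.lower())


def match_username(old_name, discourse_names, cutoff=0.85, fallback="system"):
    """Return the Discourse username that best matches old_name.

    Hypothetical helper: falls back to `fallback` when nothing is close enough.
    """
    # Map normalized name -> real Discourse name (first one wins on collisions).
    normalized = {}
    for name in discourse_names:
        normalized.setdefault(normalize(name), name)

    key = normalize(old_name)
    if not key:
        return fallback

    # "Pyro Stick" and "Pyrostick" normalize to the same key: exact hit.
    if key in normalized:
        return normalized[key]

    # Otherwise accept the closest name, but only above the similarity cutoff.
    close = difflib.get_close_matches(key, list(normalized), n=1, cutoff=cutoff)
    return normalized[close[0]] if close else fallback


def prefix_original_author(old_name: str, message: str) -> str:
    """Put the old forum username in front of the message for continuity."""
    return f"[{old_name}]\n\n{message}"
```

A lower cutoff matches more names but also produces more wrong matches, which is the reason the matching might not always be 100% correct.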
Now… This is a lot of work to check, and not everything is fixable. A dead external link will still be dead, some MyBB functionality (wiki linking and such) will not work where the old MyBB plugin was used, and so forth.
If anyone has some spare time and would be willing to do some comparing and checking of the archive against the old forum’s version, that would be GREAT!
These posts have had some of their URLs updated by the replacement code:
https://discourse.chaos-dwarfs.com/t/archive-compiled-conversions/2697?u=michaelx
https://discourse.chaos-dwarfs.com/t/archive-compiled-gallery/2699
https://discourse.chaos-dwarfs.com/t/archive-chaos-dwarfs-online-rules-and-forum-primer/2742/1
https://discourse.chaos-dwarfs.com/t/archive-introductions-new-members-post-here/2721/6
https://discourse.chaos-dwarfs.com/t/archive-compiled-conversions/2697/2
For example: the first Xander link in the first post, the pulper link in the second post, and the links in the last two posts.
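For anyone curious what that kind of URL replacement can look like, here is a rough Python sketch. It is an illustration under assumptions, not the real replacement code: the host `old-forum.example`, the MyBB-style `showthread.php?tid=…` link shape, the `tid_to_url` lookup table and the thread ID 42 are all made up for the example.

```python
import re
from urllib.parse import parse_qs, urlparse

# Simplified URL pattern: stops at whitespace, brackets and quotes.
URL_RE = re.compile(r"https?://[^\s\[\]<>\"')]+")


def rewrite_links(message, old_host, tid_to_url):
    """Swap old-forum thread links for their Discourse URL when one is known.

    Returns (new_message, unresolved_links). Unresolved links stay in the
    text unchanged so they can be retried in a later pass.
    """
    unresolved = []

    def swap(match):
        url = match.group(0)
        parsed = urlparse(url)
        if parsed.netloc.lower() != old_host:
            return url  # external link: leave it alone
        # Assumed link shape: showthread.php?tid=<thread id>
        tid = parse_qs(parsed.query).get("tid", [None])[0]
        new_url = tid_to_url.get(tid)
        if new_url is None:
            unresolved.append(url)  # target not migrated yet (or not a thread)
            return url
        return new_url

    return URL_RE.sub(swap, message), unresolved


# Hypothetical lookup table: old thread ID -> Discourse topic URL.
lookup = {"42": "https://discourse.chaos-dwarfs.com/t/archive-compiled-conversions/2697"}
text = (
    "See http://old-forum.example/showthread.php?tid=42 "
    "and http://old-forum.example/showthread.php?tid=99"
)
new_text, todo = rewrite_links(text, "old-forum.example", lookup)
```

In this sketch the first link is rewritten, while the second one stays as it was and ends up in `todo`, which matches the idea of retrying a link once the post it points to exists on Discourse.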
Comparing both forums is fairly simple:
- Open the post you want to check on Discourse.
- Find the same post on the old forum. For the Compiled Conversions example, you would go to the “Modeling Ideas and Advice” category and look for the “Compiled Conversions” thread (probably on the last page, but a search on the topic name works well too).
Any and ALL feedback is very welcome!