So I don't know if this is do-able or not but it's worth giving a shot.


Recently there has been more demand on the Data extraction thread for hacking tools that don't exist yet (mostly for recent games, you could say).

I'm far from wanting to leave these requests unanswered, but as a non-computer-science student I'm unfortunately not YET able to help these future translators hack their games.


I know a tutorial of that sort was made by user Proger on his blog 4 years ago.


I have several questions for hackers. The answers will decide whether or not we attempt a hacking tutorial on the forum, which would hopefully help our fan translation community just a little more in their remarkable task.


Questions for hackers:


- Is Proger's tutorial correct? If yes, is it still relevant for current game engines?


- I know the hacking process is different for every engine, but is there enough common "framework" between the different processes for this tutorial to be useful and exist ?


- If yes, can the knowledge to create an extraction tool be summarized enough for the tutorial not to require months of learning ?


- Would you as a hacker want to help me make that tutorial?



If the answers to questions 2, 3 and 4 are yes, then we'll be on our way to making that tutorial. \o/



When ? No idea  :xmas:


Here are some of my thoughts which can hopefully answer your questions.


Is Proger's tutorial correct? If yes, is it still relevant for current game engines?

I've only skimmed it, but it appears to focus on helping you figure out how a VN engine parses a binary script. That knowledge can serve as the foundation for analyzing and editing VN scripts.


In a broad sense, I feel that the tutorial is applicable to current engines, although there are other relevant (and sometimes important) topics that the tutorial does not really cover such as file structures, cryptography, or compression. In plenty of visual novels, you can get stuff extracted easily if they don't use encryption or any custom compression algorithms. If they do, those elements will need to be analyzed and reversed to build an extractor.
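To give a rough idea of what "reversing" encryption and compression ends up looking like in an extractor, here is a minimal Python sketch. Everything in it is an assumption for illustration: the single-byte XOR key, the choice of zlib, and the function names are made up, and a real engine may use something entirely different.

```python
import zlib

def xor_bytes(data, key):
    """Apply a single-byte XOR (hypothetical 'encryption' some engines use)."""
    return bytes(b ^ key for b in data)

def looks_like_zlib(data):
    # A typical zlib stream starts with 0x78, and its first two bytes,
    # read as a big-endian 16-bit number, are a multiple of 31.
    return len(data) >= 2 and data[0] == 0x78 and ((data[0] << 8) | data[1]) % 31 == 0

def unpack_payload(data, key=None):
    """Undo the assumed XOR layer, then decompress if the result looks like zlib."""
    if key is not None:
        data = xor_bytes(data, key)
    if looks_like_zlib(data):
        try:
            return zlib.decompress(data)
        except zlib.error:
            pass  # header matched by coincidence; treat as raw data
    return data
```

In practice the hard part is finding out which algorithm and key the engine uses; once that is known, the extractor side is usually about this small.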


Your topic here, though, is geared towards data extraction rather than script hacking. Even if the reader isn't interested in script hacking, the tutorial still serves as a usable intro to (dis)assembly and to using debuggers to see how an engine reads and processes its files. In other words, the tutorial is still useful, if only tangentially.



Is there enough common "framework" between the different processes for this tutorial to be useful and exist ?

In general.... yes.


Making an extractor involves thinking about and inspecting how an engine is taking data in and preparing to use it. Archive formats are designed with a logical structure that you can pick out once you understand the purpose of an archive file. In some cases, you can avoid the need to look inside a game engine; by merely inspecting an archive using a hex editor, some formats can be figured out with some intuition, thus an extractor can be developed just from that knowledge.
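As an illustration of the "logical structure" idea, here is a minimal Python sketch of an extractor for a completely made-up archive format: a 4-byte signature, a file count, an index of (name, offset, size) entries, then the raw file data. The `ARC0` signature, the field sizes, and the function names are all hypothetical; real formats differ, but many follow this general shape.

```python
import struct

MAGIC = b"ARC0"                   # hypothetical 4-byte signature
ENTRY = struct.Struct("<16sII")   # name (null-padded), offset, size; little-endian

def build_archive(files):
    """Pack {name: bytes} into the made-up format (only here to produce sample data)."""
    header_size = len(MAGIC) + 4 + ENTRY.size * len(files)
    index, blobs, offset = [], [], header_size
    for name, data in files.items():
        index.append(ENTRY.pack(name.encode("ascii"), offset, len(data)))
        blobs.append(data)
        offset += len(data)
    return MAGIC + struct.pack("<I", len(files)) + b"".join(index) + b"".join(blobs)

def extract_archive(blob):
    """Read the index, then slice each file out of the archive by offset and size."""
    if blob[:4] != MAGIC:
        raise ValueError("not an ARC0 archive")
    (count,) = struct.unpack_from("<I", blob, 4)
    files = {}
    for i in range(count):
        raw_name, offset, size = ENTRY.unpack_from(blob, 8 + i * ENTRY.size)
        name = raw_name.rstrip(b"\0").decode("ascii")
        files[name] = blob[offset:offset + size]
    return files

sample = {"script.txt": b"Hello", "bg.png": b"\x89PNG"}
archive = build_archive(sample)
print(sorted(extract_archive(archive)))
```

With a hex editor, the file names and the numbers next to them are usually the first things you spot, and that is often enough to guess a layout like this one.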


If an archive has encrypted/compressed data, or has a totally strange archive structure, then some actual analysis of the engine is probably necessary to figure out the precise details of the format(s) you're trying to examine. This is where you'll have to go in and use some of the methods that were discussed in that tutorial. Again, there are numerous approaches which will lead to the desired result, and it varies from game to game.


Going deeper, knowing how the game uses the OS/platform can greatly aid in examining how the game engine is reading its files. This kind of knowledge can come from doing your own programming, and reading certain materials like API documentation. This might be an obvious one, but MSDN is a valuable reference source for Windows development. And of course, learn how to use your debugger.



If yes, can the knowledge to create an extraction tool be summarized enough for the tutorial not to require months of learning ?

Yes. The basic ideas about data storage can be picked up in a short amount of time. After that, reading archive structures (at least sensibly designed ones) can be fairly easy if you're good at recognizing patterns and important elements.
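For a taste of those basics, here is a small Python sketch that prints bytes the way a hex editor shows them and reads a little-endian integer back out. The `PACK` header and its fields are invented for the example, not taken from any real game.

```python
import struct

def hexdump(data, width=16):
    """Format bytes as offset, hex values, and printable ASCII, like a hex editor."""
    lines = []
    for off in range(0, len(data), width):
        chunk = data[off:off + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk)
        text = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{off:08x}  {hex_part:<{width * 3 - 1}}  {text}")
    return "\n".join(lines)

# Invented header: 4-byte signature, then two 32-bit little-endian integers.
header = b"PACK" + struct.pack("<I", 3) + struct.pack("<I", 0x1234)
print(hexdump(header))

# The bytes 03 00 00 00 at offset 4 read back as the integer 3.
(file_count,) = struct.unpack_from("<I", header, 4)
print(file_count)
```

Notice that 0x1234 is stored as `34 12 00 00`: the lowest byte comes first. Spotting that kind of pattern in a dump is a large part of figuring out a format.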


Moving onward, there are ways to have a working extractor without dropping months into the topic. There are tools like QuickBMS which take a user-made script and use it to extract a particular archive format.... but I have not personally used it a lot, so that's all I know about it.


But still, I think spending the time to learn an actual programming language will turn out to be useful if you learn it with a goal in mind. A file extractor is (in my opinion) a not-too-hard program to develop. Going from zero know-how to having something working within a couple of days or weeks is certainly possible. With the right kind of motivation, you can get through a decent share of the challenges you'll face.



- Would you as a hacker want to help me make that tutorial?

.....I'm not opposed to the idea. B)


Awesome!

I'll contact you before this weekend and we'll start discussing the project then :)


Other hackers are still welcome; the more people there are, the more effective we'll be.


  • 2 weeks later...

I noticed that what you have so far is a guide to the number systems and character encodings used in programming, plus a mention of some hex editors.


So I guess this is going to be a combination guide & tutorial (the guide being the general information part, and the tutorial being the specific "let's go through an example together" part). That's a good idea, but maybe there's a way to speed things up:


There are already a ton of well-written guides one google away (for example "binary and hexadecimal number systems").


Which got me thinking: do you really want to write all this yourselves?


If not, how about a meta-guide: just link to various sites around the web.  Instead of actually having to write it yourself, you only have to consider which items are pertinent and what sites you think explain it clearly.


Or even, how about this: You start a Programming Resources thread, similar to the Japanese Learning Resources thread, where people can suggest links, and ask questions that are related to them.

And then, your actual tutorial would be limited to the part that brings it all together and ties it to a specific visual novel or whatever your example is going to be.


Personally, I don't like tutorials that are made up of various links.

I'd rather have all the information I need gathered in one place.

This is just a matter of preference, really.


So there will be theory + example in the tutorial.


And it doesn't bother me to write extra stuff; we don't have a deadline for finishing this. And when it's something as easy as number systems, it's really not that long.


The "Programming Resources Thread" is a good idea. The final thread will be open to suggestions and questions.


However, the tutorial won't be "novel specific"; the objective is to make it as general as possible.

This is why there is theory.

Then there's an example to illustrate.
