
Script Extraction Thread



Can someone who knows about data structures help me figure out these data files?
Here are the data files and the extraction/repack script if anyone wants to give it a try and help me.
https://www.mediafire.com/file/qfi8d5gfo46mq6g/data_files_and_extraction_script.zip/file

I don't really know anything about data extraction, but as far as I know all the strings are in plain text in the data files. I just need to figure out the data structure and adjust the script so it can properly collect the strings and repack them in the right place.
Because of the size, the included files are only 2 of the data files extracted from the game data. The game is Dragon Carnival from Splush Wave.

So...
Someone made a script to extract the dialogue text from Splush Wave games, but unfortunately it doesn't extract the RPG text.
I've just found out where the RPG text is, but I don't know enough about data structures to figure out how to properly extract and repack it.

For reference, the dialogue text is extracted using these structs:

Spoiler



from ctypes import Structure, c_char, c_uint32

class SplushMBT(Structure):
    _fields_ = [
                ('Magic', c_char * 4),
                ('Sections', c_uint32),
                ('Events', c_uint32),
                ('SectionPos', c_uint32),
                ]
class SplushMBTSectionEntry(Structure):
    _fields_ = [
                ('Start', c_uint32),
                ('Num', c_uint32),
                ('FilenamePos', c_uint32),
                ]
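
For anyone who wants to try these structs on a file, here is a minimal sketch of reading them with ctypes. The little-endian layout, the reading of SectionPos as a byte offset to the entry table, and the helper name read_mbt are assumptions for illustration and are not confirmed against the real format. The sample bytes (including the 'TEST' magic) are made up.

```python
from ctypes import LittleEndianStructure, c_char, c_uint32, sizeof
from struct import pack


class SplushMBT(LittleEndianStructure):
    _fields_ = [
        ('Magic', c_char * 4),
        ('Sections', c_uint32),
        ('Events', c_uint32),
        ('SectionPos', c_uint32),
    ]


class SplushMBTSectionEntry(LittleEndianStructure):
    _fields_ = [
        ('Start', c_uint32),
        ('Num', c_uint32),
        ('FilenamePos', c_uint32),
    ]


def read_mbt(data):
    """Return (header, entries); assumes SectionPos is a byte offset to the entry table."""
    header = SplushMBT.from_buffer_copy(data, 0)
    entry_size = sizeof(SplushMBTSectionEntry)
    entries = []
    for n in range(header.Sections):
        offset = header.SectionPos + n * entry_size
        entries.append(SplushMBTSectionEntry.from_buffer_copy(data, offset))
    return header, entries


# Made-up sample: a 16-byte header followed directly by two section entries.
sample = pack('<4sIII', b'TEST', 2, 5, 16)
sample += pack('<III', 0, 3, 100) + pack('<III', 3, 2, 120)
header, entries = read_mbt(sample)
print(header.Magic, header.Sections, [(e.Start, e.Num, e.FilenamePos) for e in entries])
```

from_buffer_copy raises ValueError when the data is shorter than the struct, which gives a quick sanity check when guessing at offsets.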

 


And this is the data file (just the first part, where the SplushMBT struct is):

Spoiler

mes-0-dat.jpg


Now, this is the file where the item description text is:

Spoiler

DID-DN-PARA-dat-2.jpg

It uses the identifier DAP instead of MBL0 and is structured very differently.

Any help pointing me in the right direction is welcome.
Thanks.

 

--------------------------- Edit:

I've made a really crude script that can extract and replace the strings in the data file that holds the game's RPG elements.

I don't have the knowledge to figure out how the headers work, so I basically just brute-force the whole file, searching for Japanese text and replacing it.
It all seems to work, but you need to be careful not to overflow into the next string or header, because the script can't change the positions where the strings are referenced.
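
Since a replacement has to stay inside the original byte range, a check like the following could be run before writing anything. This is only a sketch: the function name fit_replacement is made up, and it assumes, as the script does, that every character of the original string takes two bytes, so the available space is twice the original character count.

```python
def fit_replacement(text, original_chars, encoding='cp932'):
    """Encode text and pad it with 0x00 up to the original byte size.

    original_chars is the character count of the original string; each
    original character is assumed to occupy two bytes.
    Raises ValueError if the encoded text would not fit.
    """
    budget = original_chars * 2
    encoded = text.encode(encoding)
    if len(encoded) > budget:
        raise ValueError(
            f"{text!r} needs {len(encoded)} bytes but only {budget} are available")
    return encoded + b'\x00' * (budget - len(encoded))


# A 4-character Japanese original leaves room for 8 bytes.
print(fit_replacement('Potion', 4))      # fits: 6 bytes plus 2 bytes of padding
try:
    fit_replacement('Super Potion', 4)   # 12 bytes, too long for the slot
except ValueError as err:
    print(err)
```

Failing loudly on an oversized translation is a design choice; truncating instead would keep the file intact but silently cut the text.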

Here is the script:  https://www.mediafire.com/file/ygg52jnelscwm1b/extract_repack_RPG_text.rar/file
Script files inside spoiler at the bottom of post.

You pass the data file name so in the case of Dragon Carnival:

python rpg_out.py DID_DN_PARA_DAT

will create a DID_DN_PARA_DAT_RPG_TXT.txt with the strings

python rpg_in.py DID_DN_PARA_DAT

will actually create a DID_DN_PARA_DAT_parsed with the strings replaced based on DID_DN_PARA_DAT_RPG_TXT.txt

---- Mediafire links deleted, the files are in the spoiler

rpg_out.py

Spoiler

from ctypes import *
from struct import pack,unpack,unpack_from
import os,sys
import pdb
import unicodedata

DEBUG = False

def log(s):
    if DEBUG:
        print(s)


"""
 Collect runs of Japanese text from text_jis (a chunk already decoded from Shift-JIS)
 into translate_str, keyed by byte offset. abs_id is the byte offset of text_jis in the file.
"""
def parse_working_string(text_jis, abs_id):
    dwidth_chars = 0
    log(f"--Start parsing: {text_jis} at byte id {abs_id}")
    found_jap = True
    found_anomaly = False
    id_found = 0
    id_len = 0
    for i in range(len(text_jis)):
        #print('-----')
        #log(f"hex is : {bytes(text_jis[i], 'shift-jis')} at {abs_id+i+dwidth_chars}")
        # Double byte '81 7C'(ー) identified as Neutral single-byte, so double-checking length
        binary_len = len(bytes(text_jis[i], 'shift-jis'))
        '''
        if bytes(text_jis[i], 'shift-jis') ==  b'\x81|':
            dwidth_chars += 1
            log(f"___________\n\nPROBLEM_BIT\n\n______")
            log(f"len is : {hex(ord(text_jis[i]))}")
            log(f"len is : {len(str(ord(text_jis[i])))}")
            continue
        '''
        if unicodedata.east_asian_width(text_jis[i]) != 'N' and unicodedata.east_asian_width(text_jis[i]) != 'Na' and \
            unicodedata.east_asian_width(text_jis[i]) != 'H':
            log(f"-------------\n"
                  f"{unicodedata.east_asian_width(text_jis[i])} -- {dwidth_chars} -- {bytes(text_jis[i], 'shift-jis')} "
                  f"\n----"
                  )
            dwidth_chars += 1
        elif binary_len >= 2: # Missinterpreted as N/Na/H but is double byte, so skip
            log(f"___________\n\nLEN_BIT\n\n______")
            log(f"len is : {hex(ord(text_jis[i]))}")
            log(f"len is : {binary_len}")
            log(f"hex is : {bytes(text_jis[i], 'shift-jis')}")
            dwidth_chars += 1
            found_anomaly = True
            # Double byte '81 7C'(ー) identified as Neutral but should be counted as a jap character, so except that, continue
            if bytes(text_jis[i], 'shift-jis') !=  b'\x81|':
                continue
            
        # Found a wide or full-width character
        if unicodedata.east_asian_width(text_jis[i]) == 'W' or unicodedata.east_asian_width(text_jis[i]) == 'F' or unicodedata.east_asian_width(text_jis[i]) == 'A'\
          or bytes(text_jis[i], 'shift-jis') ==  b'\x81|':
            log(f"Found {text_jis[i]} at {i}")
            id_len += 1
            if found_jap == False:
                found_jap = True
                id_found = i
        else: # found at least 2 chars, save to dict
            if found_jap == True:
                found_jap = False
                log(f"Break chain. Length {id_len}")
                if id_len > 1:
                    translate_str[abs_id+id_found+dwidth_chars-id_len] = text_jis[id_found : id_found+id_len]
                    log(f"Found jap text: {text_jis[id_found : id_found+id_len]} at relative {id_found}, absolute {abs_id+id_found+dwidth_chars-id_len}")

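                id_len = 0

    # A run that reaches the end of text_jis is never followed by a
    # non-Japanese character, so the loop above would not save it. Save it here.
    if found_jap and id_len > 1:
        translate_str[abs_id+id_found+dwidth_chars-id_len] = text_jis[id_found : id_found+id_len]
        log(f"Found jap text: {text_jis[id_found : id_found+id_len]} at relative {id_found}, absolute {abs_id+id_found+dwidth_chars-id_len}")

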
# ---------------------------


filename = "RPG_DAT_cut"
if len(sys.argv) > 1:
    filename = sys.argv[1]

print(f"Parsing file: {filename}\n")

fd = open(filename, 'rb')
t = fd.read()

#finished = False

#while finished == False:
p0, p1 = t, 0
iStart = 0
finished = False

translate_str = {}
totalsize = os.fstat(fd.fileno()).st_size
min_size_update = totalsize / 50.0
lastUpdate = min_size_update

print("totalsize " + str(totalsize))
while finished == False:
    try:
        if iStart > lastUpdate :
            print("Parsing file {:.2f}%...".format((iStart/totalsize)*100))
            lastUpdate = iStart+min_size_update
        #log(f"iStart = {iStart}")
        #log(t[iStart:].decode('shift-jis'))
        decoded_str = t[iStart:].decode('shift-jis')
        parse_working_string(decoded_str, iStart)
    except UnicodeDecodeError as e:
        # Any undecodable byte sequence (illegal or incomplete) ends the current
        # chunk: parse what decoded cleanly, then resume right after the bad bytes.
        # Checking only for "illegal multibyte sequence" would end the scan early
        # on any other reason and drop the rest of the file.
        log(e)
        log(f"Error decoding at {e.start} + {e.end}.")
        decoded_str = t[iStart:iStart+e.start].decode('shift-jis')

        parse_working_string(decoded_str, iStart)
        iStart += e.end
        continue

    finished = True

log(f"translations found: {translate_str}")

print(f"\n\n{len(translate_str)} strings found")

filename_script = ''.join([filename, '_RPG_TXT.txt'])
 
out = []
with open(filename_script, 'w' ,encoding='utf8') as f2:
    for k in translate_str:
        l = len(translate_str[k])
        out.append(
            f"\n===========================\n"
            f"{k} {l} file position and original length"
            f"\n---------------------------\n")
        out.append(translate_str[k])
        
        
    f2.write(''.join(out))  
  
print(f"Extracted strings to {filename_script}.\n Press enter to end.")
a = input()

 

rpg_in.py

Spoiler

from ctypes import *
from struct import pack,unpack,unpack_from
import os,sys
import pdb
import unicodedata



DEBUG = False

def log(s):
    if DEBUG:
        print(s)

try:
    # Try to get filename from argv
    filename_data = "RPG_DAT_cut"
    if len(sys.argv) > 1:
        filename_data = sys.argv[1]

    scriptfile_end = '_RPG_TXT.txt'
    filename_script = ''.join([filename_data, scriptfile_end])

    # passed script as argv instead of datafile
    if filename_data.endswith(scriptfile_end):
        filename_script = filename_data
        filename_data = filename_data[:-len(scriptfile_end)]

    print(f"Parsing translation file: {filename_data}\n")

    with open(filename_script,'r',encoding='utf8') as f:
        fd = f.read().splitlines()  

    # list of list containing [Pos, trans, originalSize]
    trans = []

    i = 0
    length = len(fd)
    while i < length:
        # Skips to next '==' line
        while i < length and not fd[i].startswith('='):
            i += 1        
        if i >= length:
            break;
        
        ## Text to be translated
        # print(fd[i+3])
        # next line should be a text line
        sec_pos = int(fd[i+1].split(' ')[0])
        sec_text = fd[i+3]
        sec_len = int(fd[i+1].split(' ')[1]) * 2 # because original is wide and len is in bytes
        
        log(f"Pos: {sec_pos} Len: {sec_len} i: {i}")
        
        trans.append([sec_pos, sec_text, sec_len]);
        
        i += 3
        
    # Now parsing translated string to data file
    import shutil
    filename_data_parsed = ''.join([filename_data, '_parsed'])
    shutil.copyfile(filename_data, filename_data_parsed)


    w = open(filename_data_parsed, 'rb+')

    for item in trans:
        log(item)
        bstr = bytearray(item[1], encoding='CP932')
        added_clear_bytes = 0
        if len(bstr) < item[2]:
            added_clear_bytes = item[2] - len(bstr)
            log(f"Added clear bytes: {added_clear_bytes}")
        
        log(f"Seek at {item[0]}")
        w.seek(item[0], 0)
        w.write(bstr)
        for _ in range(added_clear_bytes):
            log('clear byte')
            w.write(b'\x00')
        

    w.close()  # make sure the patched bytes are written to disk before reporting

    print(f"Parsed translations to {filename_data_parsed}")
except FileNotFoundError as not_found:
    print(f"\n-------------------\n"
          f"File not found: {not_found.filename}"
          f"\n-------------------")
except:
    print(sys.exc_info())
finally:
    print(f"\n Press enter to end")
    a = input()

 

 

Edited by darksshades
Link deleted. Re-adding.

  • 4 weeks later...
  • 2 weeks later...
On 6/25/2021 at 2:24 PM, Tgsaudoi said:

So, about Silky Engine: I used GARbro to extract the MES files from script.arc and finished editing them. How can I pack them into Script.arc again, or how do I use them?

You don't actually need to repack it. More information about translating games on Silky Engine can be found here (warning: Russian only!). See the "Как работать с архивами?" ("How to work with archives?") section there.
You may also want to use this tool to edit MES files.


11 hours ago, Tester said:

You don't actually need to repack it. More information about translating games on Silky Engine can be found here (warning: Russian only!). See the "Как работать с архивами?" ("How to work with archives?") section there.
You may also want to use this tool to edit MES files.

So how can I use the edited MES files? Where do I have to put them?


  • 3 weeks later...

Once again, this thread should be updated, because a new tool has been created.
See this topic.

===

After some time tinkering with the SLG System engine, I created a tool to work with its scripts. And not just a simple string editor, no.

SLGScriptTool is a dual-language GUI tool for (de)compiling and (de/en)crypting (with key finding) scripts of the SLG System engine. It supports all known versions of SLG System: 0, 1, 2, 3 (3.0, 3.1), 4 (4.0, 4.1), but may lack support for some of its variations. With this tool you can decompile and compile SLG System scripts, (en/de)crypt the scripts of any game on SLG System, and find the key of any game on SLG System via a cryptographic attack.

Link for the tool: https://github.com/TesterTesterov/SLGSystemScriptTool
Built release (for x64 and x32): https://github.com/TesterTesterov/SLGSystemScriptTool/releases/tag/v1.0

Why is SLG System so noteworthy? The legendary Sengoku Hime and Sangoku Hime series (PC versions) use this engine (well, Sengoku Hime 7 and probably Sangoku Hime 5 use Unity, but pay that no mind).

Now, with SLGSystemDataTool and some hex editing of the supplementary data files, it is possible to translate those game series.

Edited by Tester
New topic.

  • 4 weeks later...

Hi. I wish to translate the game "Yami no koe" / "The Voice in the Night" by Black Cyc (https://vndb.org/v3334).

I searched this forum and found GARbro. I tried this tool and can extract all the game art but not the script. The scripts are already editable.

The scripts are .spt files. I found a post talking about a tool called gsppt but can't find where to download it.

Does someone have a tool for unpacking / packing script files?


Hey, I extracted the script from https://vndb.org/v5233 and I cannot read it (nor can I open the CGs or the OST). I guess it's encrypted, but I have no idea how to open the files, even after trying what's in the OP. I don't even know whether the extracted files are corrupted or not.

I put the scripts and the .xp3 file here, in case someone can help me.

https://mega.nz/folder/5oZhAAbY#EZYphfxwWIaO0szMlp9cBw


  • 2 weeks later...

Hi! I've been trying to work with the Sadistic Blood scripts. Since it's from Black Cyc, the gsspt tool can extract the text, but when I try to insert the changed text, nothing in the .spt file changes, and so nothing changes in the game either.

The only difference I noticed between the Gore Screaming Show scripts and Sadistic Blood's is the paragraphs, but that doesn't look like an issue.

Does someone know a way to work around it?


  • 1 month later...

I've been working on extracting the script files from Reminiscence (https://vndb.org/v7773), which are in the .mjo file format, because I am helping someone who is fluent in Japanese and wants to mess around with the script files. I've been able to extract every script except around 10 that seem to be related to the story.

The tool I was using to extract the scripts is https://github.com/Inori/FuckGalEngine/tree/master/Majiro/mjdev. The most common errors I get when I try to extract the stubborn scripts are "Fatal error: exception Failure("unknown command text_control_7a at 0x00199d")", "Fatal error: exception Failure("unknown command text_control_64 at 0x006002")", and "Fatal error: exception Failure("unknown command op845 at 0x0024c3")".

I found a tool that actually seems to be for the game (https://github.com/regomne/chinesize/tree/master/Majiro/hook_proj) but I've been unable to build it.

Does anyone know a workaround or another tool that might work?

Edited by kralc
Forgot to add script file format

1 hour ago, kralc said:

I've been working on extracting the script files from Reminiscence that are in the .mjo file format(https://vndb.org/v7773) because there is someone that I am helping that is fluent in Japanese and wants to mess around with the script files. I've been able to extract every script except around 10 scripts that seem to be related to the story.

The tool that I was using to extract the scripts is https://github.com/Inori/FuckGalEngine/tree/master/Majiro/mjdev and the error I get when I try to extract the stubborn scripts is "Fatal error: exception Failure("unknown command text_control_7a at 0x00199d")", "Fatal error: exception Failure("unknown command text_control_64 at 0x006002")",  and"Fatal error: exception Failure("unknown command op845 at 0x0024c3")" seem to be the most common errors.

I found a tool that actually seems to be for the game (https://github.com/regomne/chinesize/tree/master/Majiro/hook_proj)  but I've been unable to build it.

Does anyone know a work around or another tool that might work?

From the error messages, it seems that the tool isn't able to understand those opcodes in the script: 0x7A, 0x64 and the third one.
This usually means the programmer hasn't come across those opcodes in the game engine yet, or you're working with a later version of the game engine.
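
To illustrate why this happens, here is a toy disassembler loop. The opcode numbers, names and operand sizes are invented and have nothing to do with the real Majiro format. The point is only that any byte missing from the table stops the tool, and that reporting the unknown opcode instead of aborting can help find out which ones still need to be added.

```python
# Hypothetical opcode table: opcode byte -> (name, operand size in bytes).
KNOWN_OPS = {
    0x01: ('push', 2),
    0x02: ('text', 1),
    0x03: ('end', 0),
}


def disassemble(code, strict=True):
    """Walk the byte string and return a list of (offset, name, operand bytes)."""
    out = []
    pos = 0
    while pos < len(code):
        op = code[pos]
        if op not in KNOWN_OPS:
            if strict:
                raise ValueError(f"unknown command op{op:02x} at 0x{pos:06x}")
            # Tolerant mode: record the byte and stop, because the operand
            # size of an unknown opcode cannot be guessed.
            out.append((pos, f"unknown_{op:02x}", b''))
            break
        name, size = KNOWN_OPS[op]
        out.append((pos, name, code[pos + 1:pos + 1 + size]))
        pos += 1 + size
    return out


script = bytes([0x01, 0x10, 0x00, 0x7A, 0x03])
print(disassemble(script, strict=False))
```

In a real tool the fix is to add the missing opcodes (with the correct operand layout) to the table, which usually means studying the engine or a newer version of the tool.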


21 hours ago, kralc said:

I've been working on extracting the script files from Reminiscence that are in the .mjo file format(https://vndb.org/v7773) because there is someone that I am helping that is fluent in Japanese and wants to mess around with the script files. I've been able to extract every script except around 10 scripts that seem to be related to the story.

The tool that I was using to extract the scripts is https://github.com/Inori/FuckGalEngine/tree/master/Majiro/mjdev and the error I get when I try to extract the stubborn scripts is "Fatal error: exception Failure("unknown command text_control_7a at 0x00199d")", "Fatal error: exception Failure("unknown command text_control_64 at 0x006002")",  and"Fatal error: exception Failure("unknown command op845 at 0x0024c3")" seem to be the most common errors.

I found a tool that actually seems to be for the game (https://github.com/regomne/chinesize/tree/master/Majiro/hook_proj)  but I've been unable to build it.

Does anyone know a work around or another tool that might work?

I actually found a tool to view and edit the text (https://github.com/regomne/lneditor) but I'm not sure how to save it correctly because the GUI is in Chinese. Would someone be able to help me?

  p2xvdBz.png

 


  • 2 weeks later...

Hello, I currently have the Chinese version of Kajiri Kamui Kagura. I want to extract the script, but so far no method, such as Enigma or GARbro, has worked for me. Can someone help me with this? I suspect that the exe file might be the same as the one in the Chinese version of Fortissimo.

Here's the game in case someone needs it: https://drive.google.com/file/d/1aXiUB6JjHEJrdkNWzH4zYhHuH2kQWxgK/view


  • 2 weeks later...

Hey there. I want to access the script files of Starless.
I managed to extract the scenario.arc file and it gave me .mjo files that seem to be encrypted since they show gibberish. What tool can I use for that? 
I must point out I extracted the script files for the ENGLISH version, in case that might be the reason the script is showing as gibberish. This is just for personal use to compare and study the translation. I will extract the Japanese script later.


  • 2 weeks later...
  • 2 weeks later...
On 11/7/2021 at 10:29 AM, dullian said:

Hey there. I want to access the script files of Starless.
I managed to extract the scenario.arc file and it gave me .mjo files that seem to be encrypted since they show gibberish. What tool can I use for that? 
I must point out I extracted the script files for the ENGLISH version, in case that might the reason the script is showing as gibberish. This is just for personal use to compare and study the translation. I will extract the Japanese script later.

It's the Majiro engine. Use https://github.com/Inori/FuckGalEngine/tree/master/Majiro/mjdev to extract the script.


On 10/15/2021 at 9:36 AM, kralc said:

I actually found a tool to view and edit the text (https://github.com/regomne/lneditor) but I'm not sure how to save it correctly because the GUI is in Chinese. Would someone be able to help me?

  p2xvdBz.png 

I know how to hack Reminiscence. I tried both lneditor and mjdev, and guess what, it works. If you want to translate it to English, it's easy. I tried it with my language: unknown.png


  • 4 weeks later...
On 5/30/2021 at 8:53 AM, darksshades said:

Now.... this is the file where the items descriptions text is:

Hmm, so thou had a problem with that text structure? Probably late, but I'll give one hint about this structure. That description text "in the sea of 0x00" is basically what I call a "fixed string in the sea of 0x00".

There are N bytes in this structure (often 2^n, but not always). If the actual string is not "", the structure begins with the string bytes, and the rest is 0x00. In other words, it has a constant size unrelated to the string size. To unpack it, just read all N bytes of the structure, take all the non-0x00 bytes from them, then decode. You'll have the string. I think you already know how to pack it now.
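
The hint above can be sketched in a few lines. The function names, the 16-byte field size, the record layout and the CP932 encoding are assumptions made up for the example; the real N and the field offsets have to be worked out from the data file.

```python
def read_fixed_string(data, offset, size, encoding='cp932'):
    """Read a size-byte field at offset and decode its non-0x00 bytes."""
    field = data[offset:offset + size]
    return field.replace(b'\x00', b'').decode(encoding)


def write_fixed_string(data, offset, size, text, encoding='cp932'):
    """Overwrite the field in a bytearray with text, padded with 0x00 to size bytes."""
    encoded = text.encode(encoding)
    if len(encoded) > size:
        raise ValueError(f"{len(encoded)} bytes do not fit in a {size}-byte field")
    data[offset:offset + size] = encoded.ljust(size, b'\x00')


# Made-up record: 4 bytes of other data, then a 16-byte name field.
record = bytearray(b'\x01\x02\x03\x04' + 'やくそう'.encode('cp932').ljust(16, b'\x00'))
print(read_fixed_string(record, 4, 16))
write_fixed_string(record, 4, 16, 'Herb')
print(read_fixed_string(record, 4, 16), len(record))
```

Because the field keeps its size, nothing after it moves and no offsets elsewhere in the file need to change. Whether the game expects at least one trailing 0x00 in the field is not known, so leaving the last byte free may be the safer choice.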

Edited by Tester

I had wanted to do this for a long time, yet only now have I had the time.

Last year I hacked Silky Engine's mes files and created a tool, mesScriptAsseAndDisassembler. Alas, it was (like gscScriptCompAndDecompiler) far from ideal. It worked with general scripts, but 1) it could not (dis)assemble any technical scripts; 2) it did not work with LIBLARY.LIB, a specific script; 3) it did not work if one of the parameters had an unimplemented value. It also had some problems with optimization, code style, strings, lack of mass conversion and such.

But times have changed. Here is the new mesScriptAsseAndDisassembler 1.2.

The tool has now been fully refactored and all of these problems have been fixed. Now you can use it without fear of common problems.
Here is the change log...

1) Full code refactoring in both the GUI and the script managing class.
2) Normal tests in main.
3) Big optimization of mes script management.
4) Many mistakes, such as incorrect bytes or incorrect string returning, fixed.
5) Added a directory management option.
6) Added some convenient functionality to the GUI. For example, auto-guessing the name of the other file after choosing one.
7) Now uses a threading system, so the GUI no longer locks (well, almost... it can lock for a very short time in the case of directory management).
8) Completely new installation script.
9) More data in the .exe.
10) Help data updated.
11) Now supports more scripts and gives more correct output for some arguments thanks to a new argument parsing system (close to the one in SLGSystemScriptTool).
12) Fixed a very rare case when the header contains one more section.
13) Some new commands were named.
14) Now works with LIBLARY.LIB like with any mes script.

Edited by Tester
