
Script Extraction Thread



On 1/28/2021 at 2:00 PM, Brandon DC115 said:

Hi, I'm searching for a tool to translate Mashiro Iro Symphony. It uses .DAT archives, but I don't know what tools to use.

This might work.
I haven't checked it, though.
A ましろ色シンフォニー (Mashiro-iro Symphony) sana edition will be released sometime in the future, so you could wait for it.

https://web.archive.org/web/20190309075007/http://www.geocities.jp/hoku_hoshi/souko/souko078.html

Edited by rinnow

  • 2 months later...

Hi everyone, I hope this is the right place. I have created a collection of tools for the game "Tropical Kiss" (https://vndb.org/v2516) and thought they might be helpful here. You can extract and repack all the .pak files the game uses, as well as disassemble the scripts in the file called "Scenario.dat". I have now also managed to write a compiler that reassembles the modified scripts so that they work in the game.

Link to the GitHub repository: https://github.com/Anonym271/ts-system-tools

The disassembled scripts are not really comfortable to write new scenarios with, but translation should definitely be possible. It is a rather old eroge, so I'm not sure whether anyone is even interested in translating it, but I still thought this might be helpful here.


  • 2 weeks later...
On 4/12/2021 at 3:45 AM, Anonym271 said:

Hi everyone, I hope this is the right place. I have created a collection of tools for the game "Tropical Kiss" (https://vndb.org/v2516) and thought they might be helpful here. You can extract and repack all the .pak files the game uses, as well as disassemble the scripts in the file called "Scenario.dat". I have now also managed to write a compiler that reassembles the modified scripts so that they work in the game.

Link to the GitHub repository: https://github.com/Anonym271/ts-system-tools

The disassembled scripts are not really comfortable to write new scenarios with, but translation should definitely be possible. It is a rather old eroge, so I'm not sure whether anyone is even interested in translating it, but I still thought this might be helpful here.

I guess I am going to do this. I read the manga a long time ago, and it has several routes. But only after Princess Lover!


Hi, I'm working on a translation of Dragon Knight 4 by élf. I have managed to extract .mes (game script) files from a file called "mes.arc" from the Windows version of the game. I have done this with GARbro, YUNOArcTools, and AE VN Tools. I can read and translate the text (I am currently translating Seabose port town, village #6 in chronological order), but repacking creates a huge .arc file (over 2x as big as the original one) that crashes the game. As of now, I don't think I can do much more than a text-only translation to be uploaded to GameFAQs and the like, but I wanted to exhaust all options before giving up on hacking DK4 itself. Going by AE VN Tools, it seems DK4 for Windows has a proprietary file system, and AE can only read it, not write to it (the aforementioned exporting with AE was done in various .arc formats that support reading and writing). I've tried hacking the PC-98 .hdi file for DK4, but this is even harder. I can provide more information if asked. I'm more knowledgeable in Japanese than in coding (though I am not fluent in Japanese, I have taken several years of classes), so the help I need is technical. If anyone has any ideas, please do assist.


On 4/21/2021 at 7:34 PM, Just Translate it said:

Can someone tell me how to extract the script for the Princess Lover! VN? I have an ISO file for it and used WinISO, but there are several files and I don't know which one contains the script.

Thanks!

I am not sure if I understood this correctly, but if you have the ISO of the game you should probably install it from this ISO first (mount it and then run the installer). After that navigate to the directory where you installed the game and then into "GameData". There you will find some .pack archives that contain the game files. You can unpack them using GARbro, but for repacking you will need to use another tool (I found this one after a quick search). The scripts are in the "data4.pack" archive, have a .s extension and are Shift-JIS encoded (I suggest editing them using Notepad++).
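
If Notepad++ is not at hand, the Shift-JIS round trip can also be done with a few lines of Python. This is only a sketch: the helper name `reencode` and the sample line are made up for illustration, and it assumes the scripts decode cleanly as `cp932` (the Windows flavour of Shift-JIS), which has not been verified against the actual .s files.

```python
def reencode(data: bytes, source: str, target: str) -> bytes:
    """Decode bytes from one encoding and re-encode them in another."""
    return data.decode(source).encode(target)


# Hypothetical sample line standing in for the content of a .s script.
original = "「おはよう」".encode("cp932")

# Convert to UTF-8 for comfortable editing...
as_utf8 = reencode(original, "cp932", "utf-8")

# ...and back to Shift-JIS before repacking.
back = reencode(as_utf8, "utf-8", "cp932")
```

Working on raw bytes (rather than opening the files in text mode) keeps the line endings exactly as the game shipped them.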


On 4/25/2021 at 4:27 AM, ShintoCetra said:

Hi, I'm working on a translation of Dragon Knight 4 by élf. I have managed to extract .mes (game script) files from a file called "mes.arc" from the Windows version of the game. I have done this with GARbro, YUNOArcTools, and AE VN Tools. I can read and translate the text (I am currently translating Seabose port town, village #6 in chronological order), but repacking creates a huge .arc file (over 2x as big as the original one) that crashes the game. As of now, I don't think I can do much more than a text-only translation to be uploaded to GameFAQs and the like, but I wanted to exhaust all options before giving up on hacking DK4 itself. Going by AE VN Tools, it seems DK4 for Windows has a proprietary file system, and AE can only read it, not write to it (the aforementioned exporting with AE was done in various .arc formats that support reading and writing). I've tried hacking the PC-98 .hdi file for DK4, but this is even harder. I can provide more information if asked. I'm more knowledgeable in Japanese than in coding (though I am not fluent in Japanese, I have taken several years of classes), so the help I need is technical. If anyone has any ideas, please do assist.

Are the original files smaller than 4 GB and the new ones bigger? If so, I guess the archive format only supports files up to 4 GB (2 GB may also be a possible limit). The format probably contains support for a compression algorithm that was used to compress the original archives but your repacker didn't implement this compression. This is why your new files are so much bigger and probably also why they exceed the maximum supported size of the archive format while the original ones didn't.
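
One rough way to test the compression guess is to measure the byte entropy of the original archive and of the repacked one. The sketch below is a heuristic only and not part of any of the tools mentioned: compressed or encrypted data usually comes out close to 8 bits per byte, while plain Shift-JIS text tends to be noticeably lower. The function name `byte_entropy` is made up for this example.

```python
import math
from collections import Counter


def byte_entropy(data: bytes) -> float:
    """Shannon entropy of the data in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())
```

Reading both files with `open(path, "rb").read()` and comparing the two values would show whether the original looks packed while the repacked file looks like plain data.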


1 hour ago, Anonym271 said:

Are the original files smaller than 4 GB and the new ones bigger? If so, I guess the archive format only supports files up to 4 GB (2 GB may also be a possible limit). The format probably contains support for a compression algorithm that was used to compress the original archives but your repacker didn't implement this compression. This is why your new files are so much bigger and probably also why they exceed the maximum supported size of the archive format while the original ones didn't.

Hi, thank you for responding. The original .arc file size is 4 MB (not GB). GARbro's repacked .arc file that crashes is 12.5 MB. I should clarify that AE VN Tools usually extracts smaller .mes files that do repack properly, but they are presumably compressed, as I cannot read the script within them. I've extracted and repacked with the .arc format "Will V2/V1" in GARbro, as those were the only .arc formats listed. With AE VN Tools, I've extracted with the .arc formats "Dragon Knight 4", "elf A16", and "Will v2", and repacked with "elf A16" and "Will v2", which support repacking (again, the "Dragon Knight 4" format is read-only). I tried all other .arc formats in AE; they did not produce results. Please let me know if you have any ideas, especially about either repacking the .arc file while keeping the file size, or decompressing and then recompressing the smaller .mes files.


On 5/1/2021 at 5:08 AM, Anonym271 said:

I am not sure if I understood this correctly, but if you have the ISO of the game you should probably install it from this ISO first (mount it and then run the installer). After that navigate to the directory where you installed the game and then into "GameData". There you will find some .pack archives that contain the game files. You can unpack them using GARbro, but for repacking you will need to use another tool (I found this one after a quick search). The scripts are in the "data4.pack" archive, have a .s extension and are Shift-JIS encoded (I suggest editing them using Notepad++).

Hello, thanks for replying, but the ISO file is not for PC; it is a PS2 ISO file. I could attach a download link if you could help me with it.


15 minutes ago, Just Translate it said:

Hello, thanks for replying, but the ISO file is not for PC; it is a PS2 ISO file. I could attach a download link if you could help me with it.

Oh, okay. This makes more sense now. I am pretty sure that it is not allowed to share pirated software here, but I will find my own way to the ISO ;)


18 minutes ago, Anonym271 said:

Oh, okay. This makes more sense now. I am pretty sure that it is not allowed to share pirated software here, but I will find my own way to the ISO ;)

The ISO contains a big .bin file. I believe it contains the script, but I don't know how to extract it, so it would be helpful if you could do it 😊😊

Edited by Just Translate it

On 5/2/2021 at 3:29 PM, Just Translate it said:

The ISO contains a big .bin file. I believe it contains the script, but I don't know how to extract it, so it would be helpful if you could do it 😊😊

Hm, the BIN file does not really have any obvious structure, and I am also not very familiar with the MIPS architecture (the one the PS2 uses). Do you really need the PS2 version? The files of the PC version are already well known...


On 5/4/2021 at 11:49 PM, Anonym271 said:

Hm, the BIN file does not really have any obvious structure, and I am also not very familiar with the MIPS architecture (the one the PS2 uses). Do you really need the PS2 version? The files of the PC version are already well known...

The problem is that the PC version of this game is nowhere to be found; it is not even available to buy. Luckily I found the PS2 version, so I was hoping to play it through an emulator after I am done with the scripts 😃😃


On 5/4/2021 at 11:49 PM, Anonym271 said:

Hm, the BIN file does not really have any obvious structure, and I am also not very familiar with the MIPS architecture (the one the PS2 uses). Do you really need the PS2 version? The files of the PC version are already well known...

Hey bro, fortunately I found the PC version. As you told me, I used GARbro to extract the files, and yes, I found the scripts. But I don't understand the tool you linked for repacking the scripts. Can you give me some basic info on how to use it, please? I have been waiting for a long time to play this game in English translation...


On 5/10/2021 at 7:55 PM, Just Translate it said:

Hey bro, fortunately I found the PC version. As you told me, I used GARbro to extract the files, and yes, I found the scripts. But I don't understand the tool you linked for repacking the scripts. Can you give me some basic info on how to use it, please? I have been waiting for a long time to play this game in English translation...

I guess the tool I sent you was actually for another version of the PACK files. But it looks like you found your solution in the data extraction thread already, right?


Okay, so I found a really awesome site that has a bunch of script unpackers and repackers. http://asmodean.reverse.net/pages/exmed.html and https://proger.me/vn/old/#arctool are great sites.

Looks like we finally get some decent tools for extracting .MSD and md_scr.med archive files. Inside are multiple scenario files, and they are of different types.

Upon extracting the main MSD archive file, each scenario is listed as S001.MED, S002.MED, etc. You cannot extract further. Extracting from an md_scr.med archive produces scripts with no extension, just a file name; an example would be 001_Ayana. Again, you can't extract further. So I'm looking for a universal text/script editor to translate and to edit directory paths to images, as I'm also doing decensor work.

I tried pulling these files into Notepad++, but changing the encoding to JIS does not seem to do anything: it is a garbled mess most of the time. The .MSD files read better and I can see the lines in proper Japanese, but the directory trees that are supposed to show in script files are a garbled mess. Scripts extracted from an md_scr.med archive won't even try to appear normal. I can't even see the lines to translate, just garbage text. I think it has to do with the computing language of the game's engine. In Notepad++ you can see options for C++, Python (in the latest version), HTML, etc. I'm not sure which engine some of these games run on, nor am I sure of the scripting language the games are programmed in. I think Notepad++ needs a plugin, but since I can't find any decent information, I need a universal script editor that will work for any game's extracted files.

The two archive files I listed above come from Westvision's Bakyuunyu Kissa, released in 2006 (MSD archive type, which I think uses the KiriKiri engine), and Fudegaki-soft's Saiminjutsu series, starting with Saiminjutsu 3 and Saiminjutsu 4.
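
Before hunting for an editor plugin, it may help to check whether the extracted bytes decode as any common Japanese text encoding at all. The snippet below is a minimal sketch with made-up names (`CANDIDATES`, `decodable_encodings`); it only reports which candidates decode without errors. A clean decode does not prove the encoding is right, and if nothing decodes, the file is more likely binary, compressed, or encrypted, in which case no editor setting will make it readable.

```python
# Candidate encodings to try; this list is an assumption, not something
# taken from the games or tools discussed in this thread.
CANDIDATES = ("utf-8", "cp932", "euc_jp")


def decodable_encodings(data: bytes, candidates=CANDIDATES) -> list:
    """Return the candidate encodings that decode `data` without errors."""
    ok = []
    for name in candidates:
        try:
            data.decode(name)
        except UnicodeDecodeError:
            continue
        ok.append(name)
    return ok


# Hypothetical sample: the word "テスト" stored as Shift-JIS bytes.
sample = "テスト".encode("cp932")
result = decodable_encodings(sample)
```

For a real file, `data` would come from `open(path, "rb").read()`.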

So what do I need to do? I don't want to use a different editor each time I work on a different game. Do I use a standard text editor, or do I need some type of script editor? I need a lot of help. And don't get me started on games from Lune. I have been trying to edit script files from those games, and they are a nightmare. All images for character sprites are positioned by x and y coordinates when they are drawn on the screen. For Lune games, the image placement coordinates have their own special script in a separate directory. That's why the main scenario scripts may only refer to an image file name but not to the coordinate files.

A long time ago I was able to partially decipher the script files for Saiminjutsu 4 and found that the characters' faces and body sprites are separate. Main scenario files usually call only the body sprite; the face is separate. Also, depending on how zoomed in the character is, there may be a category for small and normal. But that is referenced from a different script file, which then has its own naming convention. Because the script files are a garbled mess, I can't see the call commands, so I'm guessing. This is why it's vital for me to see the directory paths and calls properly when editing scripts.

So if anyone has any advice on a good text editor/script editor to use for both decensoring images and translating, that would be really, really helpful.


Okay, so an update. I was able to somewhat get Notepad++ working to correctly show Japanese for the script files that came from the md_scr.med file. Now that I have translated a short route and edited some pictures, I am using exmed and merge_mpark2.exe to try to merge the extracted script files back into an archive file called md_scr.med. But I don't know how. The exmed.exe file works fine by dragging and dropping the archive onto it to extract things, but I have no clue how to repack. Help!

Here is the original script file by Fudegaki-soft. Could somebody take a whack at it and create a proper unpacker and repacker for the scenario archive file below? By the way, I can't run Python. I've tried to figure the damn thing out, but I have no clue how it works. I'm a GUI type of person.

https://drive.google.com/file/d/1RAorYKwc_QnMLUqPgyMVHywQaeTZLAH1/view?usp=sharing

Edited by Haiyami

  • 2 weeks later...
On 5/26/2021 at 12:04 PM, Kcjpunk said:

Guys, I need help. I want to translate Atelier Sakura Team.NTR games, but every time I repack the .xp3 file with GARbro and put it in the game folder, I get an error. It may be because I'm using the Visual Studio Code editor to change the subtitles. If so, what should I use to change subtitles? Need help.

I need a bit more information to help you there. What error?
What do you mean by subtitles: the movie subtitles or just the normal game text?
Are you repacking the whole scenario.xp3? If so, you should probably repack only your edited files as a patch.xp3, patch2.xp3, or the next numbered patch available.
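
To illustrate the "next numbered patch" idea, here is a small sketch that picks the first unused name in the sequence patch.xp3, patch2.xp3, patch3.xp3 and so on, given the file names already in the game folder. The function name `next_patch_name` is made up, and the naming sequence is taken from the description above rather than checked against a particular game.

```python
import re


def next_patch_name(existing_files):
    """Return the first unused name among patch.xp3, patch2.xp3, patch3.xp3, ..."""
    used = set()
    for name in existing_files:
        match = re.fullmatch(r"patch(\d*)\.xp3", name.lower())
        if match:
            # "patch.xp3" has no number and counts as patch number 1.
            used.add(int(match.group(1)) if match.group(1) else 1)
    number = 1
    while number in used:
        number += 1
    return "patch.xp3" if number == 1 else f"patch{number}.xp3"
```

In practice the list would come from something like `os.listdir(game_folder)`; only the translated files then need to go into the new archive.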

Edit: I've PMed you so as not to flood this thread. I can post an update here when the problem is fixed, for future reference.

Edited by darksshades

Can someone who knows about data structures help me figure out these data files?
Here are the data files and the extraction/repack script if anyone wants to give it a try and help me.
https://www.mediafire.com/file/qfi8d5gfo46mq6g/data_files_and_extraction_script.zip/file

I don't really know anything about data extraction, but as far as I know all the strings are in plain text in the data files. I just need to figure out the data structure and adjust the script so it can properly collect the strings and repack them in the right place.
The included files are only 2 of the data files extracted from the game data, because of the size. The game is Dragon Carnival from Splush Wave.

So...
Someone made a script to extract the dialogue text from Splush Wave games, but unfortunately it doesn't extract the RPG text.
I've just found out where those RPG texts are, but I don't have enough knowledge about data structures to figure out how to properly extract and repack them.

For reference, the dialogue text is extracted using these structs:

Spoiler



from ctypes import Structure, c_char, c_uint32

# File header of a dialogue data file
class SplushMBT(Structure):
    _fields_ = [
                ('Magic', c_char * 4),
                ('Sections', c_uint32),
                ('Events', c_uint32),
                ('SectionPos', c_uint32),
                ]

# One entry of the section table
class SplushMBTSectionEntry(Structure):
    _fields_ = [
                ('Start', c_uint32),
                ('Num', c_uint32),
                ('FilenamePos', c_uint32),
                ]
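
For anyone unfamiliar with ctypes, this is roughly how such structs can be filled from raw bytes. It is a sketch on a synthetic buffer, not on the real game file, and it rests on guesses made only from the field names: that the file is little-endian, that `SectionPos` is an absolute byte offset, and that `Sections` entries follow each other from there. The helper `read_sections` is made up for the example.

```python
import struct
from ctypes import LittleEndianStructure, c_char, c_uint32, sizeof


class SplushMBT(LittleEndianStructure):  # little-endian is an assumption
    _fields_ = [
        ('Magic', c_char * 4),
        ('Sections', c_uint32),
        ('Events', c_uint32),
        ('SectionPos', c_uint32),
    ]


class SplushMBTSectionEntry(LittleEndianStructure):
    _fields_ = [
        ('Start', c_uint32),
        ('Num', c_uint32),
        ('FilenamePos', c_uint32),
    ]


def read_sections(buf: bytes):
    """Parse the header, then `Sections` entries starting at `SectionPos`."""
    header = SplushMBT.from_buffer_copy(buf)
    entries = []
    for i in range(header.Sections):
        # Assumed layout: entries are stored back to back from SectionPos.
        offset = header.SectionPos + i * sizeof(SplushMBTSectionEntry)
        entry = SplushMBTSectionEntry.from_buffer_copy(buf, offset)
        entries.append((entry.Start, entry.Num, entry.FilenamePos))
    return header, entries


# Synthetic data: a 16-byte header followed directly by two section entries.
fake_file = (struct.pack('<4sIII', b'MBL0', 2, 5, 16)
             + struct.pack('<III', 0, 3, 100)
             + struct.pack('<III', 3, 2, 120))
header, entries = read_sections(fake_file)
```

If the values read from a real file look absurd (for example a section count in the millions), one of the layout guesses above is probably wrong.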

 


And this is the data file (just the first part, where the SplushMBT struct is):

[Image: mes-0-dat.jpg]

Now, this is the file where the item description text is:

[Image: DID-DN-PARA-dat-2.jpg]

It uses the identifier DAP instead of MBL0 and is structured very differently.

Any help pointing me to the right direction is welcome.
Thanks.

 

--------------------------- Edit:

I've made a really stupid script that can extract and replace the strings in the data file that holds the game's RPG elements.

I don't have the proper knowledge to figure out how the headers work, so I basically just brute-force the whole file, searching for Japanese text and replacing it.
It all seems to work, but you need to be careful not to overflow into the next string or header, as you can't change the position where the strings are referenced.
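
The overflow rule can be checked mechanically before anything is written. The sketch below uses a made-up helper, `fit_to_slot`, that encodes a replacement string, refuses it if it is longer than the original slot, and otherwise pads it with zero bytes. It assumes CP932 encoding and a slot size of two bytes per original character, as rpg_in.py does.

```python
def fit_to_slot(text: str, slot_size: int, encoding: str = "cp932") -> bytes:
    """Encode `text` and pad it with NUL bytes to exactly `slot_size` bytes.

    Raises ValueError if the encoded text does not fit, because writing it
    anyway would overwrite whatever follows the original string.
    """
    raw = text.encode(encoding)
    if len(raw) > slot_size:
        raise ValueError(
            f"{text!r} needs {len(raw)} bytes but the slot has only {slot_size}"
        )
    return raw + b"\x00" * (slot_size - len(raw))


# A 4-character Japanese original occupies 8 bytes, so "Potion" fits.
padded = fit_to_slot("Potion", 8)
```

Since ASCII letters take one byte each in CP932, an English replacement may be up to twice as many characters as the Japanese original.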

Here is the script:  https://www.mediafire.com/file/ygg52jnelscwm1b/extract_repack_RPG_text.rar/file
Script files inside spoiler at the bottom of post.

You pass the data file name so in the case of Dragon Carnival:

python rpg_out.py DID_DN_PARA_DAT

will create a DID_DN_PARA_DAT_RPG_TXT.txt with the strings

python rpg_in.py DID_DN_PARA_DAT

will actually create a DID_DN_PARA_DAT_parsed with the strings replaced based on DID_DN_PARA_DAT_RPG_TXT.txt
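
For anyone editing the _RPG_TXT.txt file by hand, its layout follows from how rpg_out.py writes it: a line of `=` signs, a line with the byte position and the original length in characters, a line of `-` signs, and then the text itself. The parser below is only an illustration of that layout; the name `parse_records` and the sample strings are made up, and rpg_in.py remains the tool that actually applies the file.

```python
def parse_records(dump: str):
    """Return (position, original_length, text) tuples from a text dump."""
    lines = dump.splitlines()
    records = []
    i = 0
    while i < len(lines):
        if lines[i].startswith("=") and i + 3 < len(lines):
            # The line after the '=' separator starts with "<pos> <len> ...".
            position, length = lines[i + 1].split(" ")[:2]
            # The text is on the line after the '-' separator.
            records.append((int(position), int(length), lines[i + 3]))
            i += 4
        else:
            i += 1
    return records


# Hypothetical dump with two entries, in the same layout rpg_out.py produces.
SEP_A, SEP_B = "=" * 27, "-" * 27
sample = (
    f"\n{SEP_A}\n1024 5 file position and original length\n{SEP_B}\nポーション"
    f"\n{SEP_A}\n2048 4 file position and original length\n{SEP_B}\nやくそう"
)
records = parse_records(sample)
```

Only the text line of each entry should be changed when translating; the position and length line tells rpg_in.py where to write and how much room there is.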

---- Mediafire links deleted, the files are in the spoiler

rpg_out.py

Spoiler

import os
import sys
import unicodedata

DEBUG = False

def log(s):
    if DEBUG:
        print(s)


"""
 parse found jap text into translate_str, abs_id shows the position before the str
 already decode text_jis
"""
def parse_working_string(text_jis, abs_id):
    dwidth_chars = 0
    log(f"--Start parsing: {text_jis} at byte id {abs_id}")
    found_jap = True
    found_anomaly = False
    id_found = 0
    id_len = 0
    for i in range(len(text_jis)):
        #print('-----')
        #log(f"hex is : {bytes(text_jis[i], 'shift-jis')} at {abs_id+i+dwidth_chars}")
        # Double byte '81 7C'(ー) identified as Neutral single-byte, so double-checking length
        binary_len = len(bytes(text_jis[i], 'shift-jis'))
        # Earlier special case for '81 7C', disabled and kept for reference:
        # if bytes(text_jis[i], 'shift-jis') ==  b'\x81|':
        #     dwidth_chars += 1
        #     log(f"___________\n\nPROBLEM_BIT\n\n______")
        #     log(f"len is : {hex(ord(text_jis[i]))}")
        #     log(f"len is : {len(str(ord(text_jis[i])))}")
        #     continue
        if unicodedata.east_asian_width(text_jis[i]) != 'N' and unicodedata.east_asian_width(text_jis[i]) != 'Na' and \
            unicodedata.east_asian_width(text_jis[i]) != 'H':
            log(f"-------------\n"
                  f"{unicodedata.east_asian_width(text_jis[i])} -- {dwidth_chars} -- {bytes(text_jis[i], 'shift-jis')} "
                  f"\n----"
                  )
            dwidth_chars += 1
        elif binary_len >= 2: # Misinterpreted as N/Na/H but is double byte, so skip
            # The original "\L" was an invalid escape sequence; removed here.
            log("___________\n\nLEN_BIT\n\n______")
            log(f"len is : {hex(ord(text_jis[i]))}")
            log(f"len is : {binary_len}")
            log(f"hex is : {bytes(text_jis[i], 'shift-jis')}")
            dwidth_chars += 1
            found_anomaly = True
            # Double byte '81 7C'(ー) identified as Neutral but should be counted as a jap character, so except that, continue
            if bytes(text_jis[i], 'shift-jis') !=  b'\x81|':
                continue
            
        # Found a wide or full-width character
        if unicodedata.east_asian_width(text_jis[i]) == 'W' or unicodedata.east_asian_width(text_jis[i]) == 'F' or unicodedata.east_asian_width(text_jis[i]) == 'A'\
          or bytes(text_jis[i], 'shift-jis') ==  b'\x81|':
            log(f"Found {text_jis[i]} at {i}")
            id_len += 1
            if found_jap == False:
                found_jap = True
                id_found = i
        else: # found at least 2 chars, save to dict
            if found_jap == True:
                found_jap = False
                log(f"Break chain. Length {id_len}")
                if id_len > 1:
                    translate_str[abs_id+id_found+dwidth_chars-id_len] = text_jis[id_found : id_found+id_len]
                    log(f"Found jap text: {text_jis[id_found : id_found+id_len]} at relative {id_found}, absolute {abs_id+id_found+dwidth_chars-id_len}")

                id_len = 0
                
                
                
                
                
                
# ---------------------------


filename = "RPG_DAT_cut"
if len(sys.argv) > 1:
    filename = sys.argv[1]

print(f"Parsing file: {filename}\n")

# Read the whole data file into memory; the file is closed right away.
with open(filename, 'rb') as fd:
    t = fd.read()

iStart = 0          # byte offset where the next decode attempt starts
finished = False

translate_str = {}  # byte offset in the file -> extracted Japanese string
totalsize = len(t)
min_size_update = totalsize / 50.0  # report progress roughly every 2%
lastUpdate = min_size_update

print("totalsize " + str(totalsize))
while finished == False:
    try:
        if iStart > lastUpdate :
            print("Parsing file {:.2f}%...".format((iStart/totalsize)*100))
            lastUpdate = iStart+min_size_update
        #log(f"iStart = {iStart}")
        #log(t[iStart:].decode('shift-jis'))
        decoded_str = t[iStart:].decode('shift-jis')
        parse_working_string(decoded_str, iStart)
    except UnicodeDecodeError as e:
        # Decode errors are expected in a binary file. Parse the text in
        # front of the offending bytes, then resume right after them.
        # e.start and e.end are relative to the slice t[iStart:].
        # Handling every decode error (not only "illegal multibyte
        # sequence") also covers a truncated character at the end of file.
        log(e)
        log(f"Error decoding at {e.start} + {e.end}.")
        decoded_str = t[iStart:iStart+e.start].decode('shift-jis')
        parse_working_string(decoded_str, iStart)
        iStart += e.end
        continue
    finished = True

log(f"translations found: {translate_str}")

print(f"\n\n{len(translate_str)} strings found")

filename_script = ''.join([filename, '_RPG_TXT.txt'])
 
out = []
with open(filename_script, 'w' ,encoding='utf8') as f2:
    for k in translate_str:
        l = len(translate_str[k])
        out.append(
            f"\n===========================\n"
            f"{k} {l} file position and original length"
            f"\n---------------------------\n")
        out.append(translate_str[k])
        
        
    f2.write(''.join(out))  
  
print(f"Extracted strings to {filename_script}.\n Press enter to end.")
a = input()

 

rpg_in.py

Spoiler

import sys



DEBUG = False

def log(s):
    if DEBUG:
        print(s)

try:
    # Try to get filename from argv
    filename_data = "RPG_DAT_cut"
    if len(sys.argv) > 1:
        filename_data = sys.argv[1]

    scriptfile_end = '_RPG_TXT.txt'
    filename_script = ''.join([filename_data, scriptfile_end])

    # passed script as argv instead of datafile
    if filename_data.endswith(scriptfile_end):
        filename_script = filename_data
        filename_data = filename_data[:-len(scriptfile_end)]

    print(f"Parsing translation file: {filename_data}\n")

    with open(filename_script,'r',encoding='utf8') as f:
        fd = f.read().splitlines()  

    # list of list containing [Pos, trans, originalSize]
    trans = []

    i = 0
    length = len(fd)
    while i < length:
        # Skip ahead to the next '=====' separator line
        while i < length and not fd[i].startswith('='):
            i += 1
        if i >= length:
            break

        # Entry layout: '=' separator, "<pos> <len> ..." line, '-' separator, text
        sec_pos = int(fd[i+1].split(' ')[0])
        sec_text = fd[i+3]
        # The stored length counts characters; the original text is
        # double-byte, so the slot size in bytes is twice that.
        sec_len = int(fd[i+1].split(' ')[1]) * 2

        log(f"Pos: {sec_pos} Len: {sec_len} i: {i}")

        trans.append([sec_pos, sec_text, sec_len])

        i += 3
        
    # Now parsing translated string to data file
    import shutil
    filename_data_parsed = ''.join([filename_data, '_parsed'])
    shutil.copyfile(filename_data, filename_data_parsed)


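    # Patch the copy in place; the with-block makes sure it is flushed and closed.
    with open(filename_data_parsed, 'rb+') as w:
        for item in trans:
            log(item)
            bstr = bytearray(item[1], encoding='CP932')
            if len(bstr) > item[2]:
                # Longer than the original slot: the write below will run
                # into the next string or header.
                print(f"Warning: text at {item[0]} is {len(bstr)} bytes, slot is {item[2]} bytes")
            added_clear_bytes = max(item[2] - len(bstr), 0)
            log(f"Added clear bytes: {added_clear_bytes}")

            log(f"Seek at {item[0]}")
            w.seek(item[0], 0)
            w.write(bstr)
            # Pad the rest of the original slot with NUL bytes.
            w.write(b'\x00' * added_clear_bytes)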
        

    print(f"Parsed translations to {filename_data_parsed}")
except FileNotFoundError as not_found:
    print(f"\n-------------------\n"
          f"File not found: {not_found.filename}"
          f"\n-------------------")
except Exception:
    # Report any other error instead of closing the window silently.
    print(sys.exc_info())
finally:
    print(f"\n Press enter to end")
    a = input()

 

 

Edited by darksshades
Link deleted. Re-adding.
