International Patching System
What is IPS?
The IPS (International Patching System) filetype was created in 1993 to express “diffs” between a controlled base file and a resultant target file. MS-DOS was the most popular operating system at the time, and some versions could not store a 32-bit number due to the limitations of the era.
The IPS filetype was developed by Japanese “ROM-Hackers” who wished to share their modifications in a way that did not promote or commit acts of digital piracy. These files were spread on file hosting/shareware and forum websites before mainstream ROM-Hacking communities were established, such as Super Mario World Central, which stores the largest archive of Super Mario World ROM-Hacks. Those hacks are the most popular use for this filetype.
- Popular IPS tools include:
Lunar IPS is the most popular, but does not support BPS like Floating IPS does. All of the above are designed for Windows systems except EWing IPS Patcher, which should work on any POSIX system.
IPS is designed for 1990s Microsoft operating systems such as MS-DOS, IPSMac is designed for early Macintosh systems, and JIPS is simply an IPS handler written in Java.
- As well as more developmental IPS tools:
IPS Peek and Chief-Net IPS both allow selective patching, meaning that parts of the IPS may be excluded or included when patching, at the user's choice. ips.py is a Python IPS tool; however, it is not as versatile as patchlib.ips, which is the recommended Python IPS module.
ROM Patcher JS, however, removes the need for executable “diff” appliers/makers, as the tool is made entirely in JavaScript and can therefore use mod files and base files in a browser. (The exception is xdelta files that use the Xdelta3 format, which requires an x64/ARM environment that most web services cannot offer.)
How does it work?
Simple: an IPS file has a static header reading PATCH and a footer reading EOF. The middle of the IPS file is a series of unterminated byte sequences. These “instances” come in two forms, each of variable size.
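As a minimal sketch of that framing, assuming the whole patch is already in memory as a `bytes` object, the header and footer can be checked before any instance is read. The helper name `looks_like_ips` is hypothetical and not part of any tool mentioned here.

```python
def looks_like_ips(patch: bytes) -> bool:
    """Return True if the data is framed by the PATCH header and EOF footer."""
    # 5 header bytes + 3 footer bytes is the smallest possible IPS file.
    return (
        len(patch) >= 8
        and patch.startswith(b"PATCH")
        and patch.endswith(b"EOF")
    )


print(looks_like_ips(b"PATCH" + b"EOF"))   # an empty but well-framed patch
print(looks_like_ips(b"not a patch"))
```

A check like this only confirms the framing. It says nothing about whether the instances in between are well formed.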
Example of a noRLE Instance:
01 23 45 00 05 67 89 AB CD EF
To make that easier to read, let’s break it down.
01 23 45 | 00 05 | 67 89 AB CD EF
The first three bytes are a 24-bit number for the target offset, and the next two bytes are a 16-bit number for the size of the data. The final series of bytes is the data, whose length is given by the 16-bit size.
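To make the breakdown concrete, here is a small sketch that decodes the example instance above. It assumes the numbers are stored big-endian (most significant byte first), which is how the reading code later on this page interprets them.

```python
# The noRLE example instance from above, written as hex.
record = bytes.fromhex("01 23 45 00 05 67 89 AB CD EF")

offset = int.from_bytes(record[0:3], "big")  # 24-bit target offset
size = int.from_bytes(record[3:5], "big")    # 16-bit size of the data
data = record[5:5 + size]                    # the bytes written at offset

print(hex(offset), size, data.hex())
# prints: 0x12345 5 6789abcdef
```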
Example of an RLE Instance:
01 23 45 00 00 FF FF 64
The first three bytes are still the 24-bit target offset; however, you may notice that the 16-bit size bytes are equal to zero, which would indicate that the data is zero in length. Instead, when the size is equal to zero we treat this as an RLE flag, indicating that this instance behaves differently.
We read past the two size bytes and take the next two bytes as a 16-bit number describing the hunk length of the RLE instance. The last byte is the hunk data and is repeated until its length is equal to the 16-bit “hunk length”.
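The same kind of sketch works for the RLE example instance. Again this assumes big-endian numbers, and the variable names are only illustrative.

```python
# The RLE example instance from above, written as hex.
record = bytes.fromhex("01 23 45 00 00 FF FF 64")

offset = int.from_bytes(record[0:3], "big")      # 24-bit target offset
size = int.from_bytes(record[3:5], "big")        # zero, so this is an RLE instance
run_length = int.from_bytes(record[5:7], "big")  # 16-bit hunk length
value = record[7:8]                              # the single byte to repeat

expanded = value * run_length                    # the data actually written at offset
print(hex(offset), size, run_length, len(expanded))
# prints: 0x12345 0 65535 65535
```

Eight bytes in the patch expand to 65,535 bytes in the target, which is why RLE is worth having for long runs of one value.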
Gaps between instances that lie beyond the end of the original file are expected to be zeroes, which drastically reduces patch size and likely complexity too, leading to faster handling. Within the bounds of the original file, the gaps between instances keep the original file contents, which are not included in the ips for both efficiency and integrity.
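The gap behaviour can be sketched with a mutable copy of the base file. This is only an illustration: `write_at` is a hypothetical helper, not a function from patchlib or any other tool.

```python
def write_at(buffer: bytearray, offset: int, data: bytes) -> None:
    """Write data at offset, padding with zero bytes if offset is past the end."""
    end = offset + len(data)
    if end > len(buffer):
        buffer.extend(bytes(end - len(buffer)))  # bytes(n) is n zero bytes
    buffer[offset:end] = data


target = bytearray(b"\x11\x22\x33\x44")  # stands in for the base file
write_at(target, 1, b"\xAA")             # inside the base: neighbours are kept
write_at(target, 8, b"\xBB\xCC")         # past the base: the gap becomes zeroes

print(target.hex(" "))
# prints: 11 aa 33 44 00 00 00 00 bb cc
```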
How to read/apply one?
Here it is demonstrated in very simple, well-annotated Python, even where that makes the code verbose.
def apply(base : bytes, patch : bytes) -> bytes:
    patch = patch[5:-3] #trim header and footer
    changes = {} #dictionary to store diffs
    count = 0 #create variable used to track progress in "patch"
    while count < len(patch): #Until we have read the last byte
        offset = patch[count:count+3] #read next 3 bytes
        offset = int(offset.hex(),16) #convert to 24 bit integer
        count += 3
        size = patch[count:count+2] #read next two bytes
        size = int(size.hex(),16) #convert to 16 bit integer
        count += 2 #move past the size bytes
        if not size: #if RLE flag set
            size = patch[count:count+2] #Access RLE length bytes
            size = int(size.hex(),16) #convert to 16 bit number
            count += 2
            data = patch[count:count+1] #the single byte to repeat
            changes[offset] = (size,data) #Store RLE instance with offset as key
            count += 1
        else:
            data = patch[count:count+size]
            changes[offset] = data #Store noRLE instance with offset as key
            count += size
    output = b""
    for offset in sorted(changes): #lowest offset first; assumes instances do not overlap
        if len(output) < offset:
            output += base[len(output):offset] #Copy base until diff start
            output += b"\x00" * (offset-len(output)) #Write zeroes if diff starts past base
        if isinstance(changes[offset],tuple): #RLE instance: repeat the byte "size" times
            output += changes[offset][1]*changes[offset][0]
        else: output += changes[offset]
    output += base[len(output):] #if we have not written up to the end of base, then do so
    return output
The code above accepts two `bytes` objects and returns a `bytes` object, which can then be written to a file. If you only need this for patching files, then you could wrap it like so:
def patchfile(modfile,basefile,outfile):
    def get(File):
        with open(File,"rb") as f:
            return f.read()
    with open(outfile,"wb") as f:
        f.write(apply(get(basefile),get(modfile)))
However, as ipsluna is a module, how it is used is up to the user. Applications beyond standard usage may be nothing short of eccentric, but that does not invalidate the intentions behind them. This is where ipsluna exceeds ips.py.
How does `ips` building work?
Constructing an ips is much more involved than applying one, as we have to account for the following things:
- `ips` files should contain minimal original data.*
- `ips` files should not attempt to make an impossibly large file.**
- `ips` files should prefer `rle` unless setup is too costly.***
- `ips` files must write to the last byte of the new file if it is bigger, even if that byte is zero.
* This does not mean that it won’t work; it just means that you may end up creating an unnecessarily large file that contains potentially sensitive data.
** By default in `patchlib` it is set to `16,777,215 bytes` (16.7 MB); however, an `ips` may reach up to `16,842,750 bytes` by setting `legacy` to `False`.
*** This is merely optimization. No `ips` has to contain `rle`; however, it should be noted that it is only optimal if the `rle` is of length `9` or higher.
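One possible reading of the `9` threshold, offered as an assumption rather than as patchlib's documented reasoning: an `rle` instance always costs 8 bytes (3 offset + 2 zero size + 2 length + 1 data), so a run that could otherwise ride along as raw data inside a neighbouring instance only pays for itself once it is 9 bytes long. The names below are illustrative only.

```python
HEADER_SIZE = 3 + 2                    # 24-bit offset + 16-bit size field
RLE_RECORD_SIZE = HEADER_SIZE + 2 + 1  # plus 16-bit run length and one data byte


def norle_record_size(length: int) -> int:
    """Bytes needed to store `length` raw bytes as their own noRLE instance."""
    return HEADER_SIZE + length


def rle_pays_off_inline(run: int) -> bool:
    """True if a separate rle instance is smaller than `run` raw bytes
    appended to a noRLE instance that already exists."""
    return RLE_RECORD_SIZE < run


for run in (4, 8, 9, 100):
    print(run, norle_record_size(run), RLE_RECORD_SIZE, rle_pays_off_inline(run))
```

Under this reading a run of 8 breaks even and a run of 9 is the first that saves a byte.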
Now that you know the rules, we can begin to create an ips file.
def build(base : bytes, target : bytes) -> bytes:
    patch,count = b"", 0
    #Lambdas for operation viability checks
    #viability: are the "dist" bytes of target starting at "offset" all the same value?
    viability = lambda offset, dist: target[offset].to_bytes(1, "big")*dist == target[offset : offset + dist]
    #compare: does target differ from base here? (always True past the end of base)
    compare = lambda offset: (base[offset] != target[offset]) if offset < len(base) else True
    def rle(): #function for processing rle data
        length = 9
        #the bounds check comes first so target is never indexed past its end
        while count + length < len(target) and compare(count + length) and viability(count, length): length += 1
        return length - 1
    def norle(): #function for processing rle unviable data
        length = 1
        while count + length < len(target) and compare(count + length) and not (viability(count + length, 9) and all(compare(count + length + r) for r in range(9))): length += 1
        return length
    #while we have not compared the final byte
    while count < len(target):
        #if we are comparing the final byte, always store it
        if count == len(target)-1:
            patch += count.to_bytes(3, "big")+b"\x00\x01"+target[count].to_bytes(1, "big")
            count += 1
        #if we have unnecessary data
        elif base[count] == target[count] if count < len(base) else target[count] == 0:
            while (base[count] == target[count] if count < len(base) else target[count] == 0) if count < len(target) - 1 else False: count += 1
        #now that we have our diff
        else:
            #determine rle viability
            isrle = viability(count, 9) and all(compare(count + r) for r in range(9))
            length = [norle,rle][isrle]() #retrieve length to store
            #while length is impossible for a singular instance
            while length > 0xFFFF:
                if isrle: patch += count.to_bytes(3, "big")+b"\x00\x00\xff\xff"+target[count].to_bytes(1, "big")
                else: patch += count.to_bytes(3, "big")+b"\xff\xff"+target[count:count+0xFFFF]
                count += 0xFFFF
                length -= 0xFFFF
            #if data was not a multiple of 0xFFFF
            if length:
                if isrle: patch += count.to_bytes(3, "big")+b"\x00\x00"+length.to_bytes(2, "big")+target[count].to_bytes(1, "big")
                else: patch += count.to_bytes(3, "big")+length.to_bytes(2, "big")+target[count:count+length]
                count += length
    #return data
    return b"PATCH"+patch+b"EOF"
This construction code is written to keep the resulting ips as small as possible. To use it on files:
def makepatch(basefile,targetfile,outfile):
    def get(File):
        with open(File,"rb") as f:
            return f.read()
    with open(outfile,"wb") as f:
        f.write(build(get(basefile),get(targetfile)))
Why do we sometimes use other patching filetypes?
bps, for example, uses variable-width offsets, and instead of immediate replacement it uses “actions” to move data and perform selective “range” overwrites, in order to achieve a goal with variable scope. ips has a reach of 16,842,750 bytes; however, a true legal ips could not write beyond the 24-bit maximum, and therefore the maximum reach is truly 16,777,215 bytes.
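Both figures follow from the field widths of an ips instance. The short calculation below shows where they come from. The constant names are only illustrative.

```python
MAX_OFFSET = 0xFFFFFF  # largest value a 24-bit offset can hold
MAX_SIZE = 0xFFFF      # largest value a 16-bit size can hold

# A "legal" ips never writes beyond the 24-bit maximum.
legal_reach = MAX_OFFSET

# An instance placed at the largest offset and carrying the largest
# amount of data would extend the file this far instead.
extended_reach = MAX_OFFSET + MAX_SIZE

print(f"{legal_reach:,} {extended_reach:,}")
# prints: 16,777,215 16,842,750
```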
ips is also horribly inefficient at patching large files. Some ips files may contain duplicates of the base contents, which is not just inefficient but also poses a security risk for the original file contents. A simple ips integrity checker could be constructed to compare base contents to patch contents and see how much they resemble each other.
In conclusion, ips is designed for an older generation of consoles that were small and simplistic. As the scope of technology gradually increases, we may eventually see bps become irrelevant too. For now, and for a long time to come, it is unreasonable to expect bps to be made redundant, as it has a theoretical reach of up to 2 exabytes.
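A minimal sketch of such a checker is shown below, assuming the patch is well formed and fully in memory. It counts the patch bytes that merely repeat what the base already holds at the same position. Both function names are hypothetical and are not part of patchlib.

```python
def iter_records(patch: bytes):
    """Yield (offset, data) for every instance; RLE instances are expanded."""
    pos, end = 5, len(patch) - 3  # skip "PATCH", stop before "EOF"
    while pos < end:
        offset = int.from_bytes(patch[pos:pos + 3], "big")
        size = int.from_bytes(patch[pos + 3:pos + 5], "big")
        pos += 5
        if size == 0:  # RLE instance: 16-bit run length, then one byte
            run = int.from_bytes(patch[pos:pos + 2], "big")
            yield offset, patch[pos + 2:pos + 3] * run
            pos += 3
        else:          # noRLE instance: "size" raw bytes
            yield offset, patch[pos:pos + size]
            pos += size


def redundant_bytes(base: bytes, patch: bytes) -> int:
    """Count patch bytes identical to the base byte they would overwrite."""
    total = 0
    for offset, data in iter_records(patch):
        original = base[offset:offset + len(data)]
        total += sum(new == old for new, old in zip(data, original))
    return total


base = b"ABCDEFGH"
# One noRLE instance at offset 2 holding b"CXE"; only the "X" is a real change.
patch = b"PATCH" + (2).to_bytes(3, "big") + (3).to_bytes(2, "big") + b"CXE" + b"EOF"
print(redundant_bytes(base, patch))
# prints: 2
```

A high count relative to the patch size suggests the ips carries more of the base file than it needs to.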
Why do we still use ips if better filetypes exist?
Easiest question of them all: ips was just there when it needed to be. Because of its common usage and popularity when ROMhacking was more niche than it is now, the filetype has been the face of early ROMhacking. ips is actually quite space efficient for most of these hacks; it fits its scope perfectly.
In some cases, you may opt for bps over ips if the scope of the project would benefit from it, however for minor edits within the size of the base file there is commonly zero reason not to choose ips unless the file you are modding requires a higher reach.
Why should I use patchlib over ips.py?
The main reason you should choose patchlib over ips.py is that it does what **every** other advanced patching tool does. After being passed the raw contents of an ips or initialising a blank canvas, patchlib offers total control of the ips. Each instance (diff) has the size, data, rle flag, and diff-reach stored in the instance class, as well as a name attribute which can be used to annotate an ips.
The benefit of all this is that we can now interact with the instances smartly. We can access them with a variety of functions such as `get` and `range`, or by accessing the `instances` attribute within the ips class, which stores each instance by order of offset. We can also modify an individual instance with the `modify` method.
Moreover, the project is being actively worked on, so updates and new features should be expected. The code exceeds all known IPS tools and is not even at a release build yet; it has full docs on PyPI and active developers in immediate contact on the Discord!
Should I make my own ips handling tool?
There is very little reason to do this. As it stands, even when ips files are being manipulated at a deep level, the tools provided are often not fully used, as the user rarely goes beyond common building and applying. There is generally a surplus of tools, so should you create your own ips tool there should be a reason for it. patchlib exists to provide total control in Python 3.
JIPS is forgivable as it runs in a Java runtime, meaning that it can run on devices that do not support Python 3. Because JIPS uses Java, the whole ideology being that it can run in any environment, this tool is very helpful to those who do not have an Operating System which any dedicated tool can support. The same would go for ips.py if patchlib did not render it redundant.
If you wish to make a tool, ensure that its benefits cannot already be found in someone else’s tools. Once you can confirm there is a point to doing this in terms of scope, usability and cause, making an ips handler makes complete sense.
Can I contribute towards patchlib?
Yes! The patchlib GitHub allows forks to be made, and anyone with some Python skill can be included in the project! In fact, there are many elements of the project left totally untouched that you could begin working on! If you are interested, feel free to get in contact on the Discord!
Is it not better just to make your own filetype?
This should be overall somewhat discouraged for these reasons:
- `ips` is standardized, people may not want to use your files/tools
- It is quite likely that `bps` could solve this, people will use that instead
- It creates some sort of proprietary sense to it, which may deter users.
- If tool sharing is too slow for demand, users may share original files
If people do not want to use your tools, then the project’s popularity will be stunted. If people can construct a bps between the base and result file, then nobody will feel obliged to use your format or tool. In a world of common base files it is natural to assume a universal format for manipulation, and for this we opt for universal filetypes. Limiting control only works for immediate distribution.