International Patching System
What is IPS?
The IPS, or International Patching System, filetype was created in 1993 to express “diffs” between a controlled base file and a resultant target file. MS-DOS was the most popular operating system at this point, and some versions could not store a 32-bit number due to limitations of the time.
The IPS filetype was developed by Japanese “ROM-Hackers” who wished to share their modifications in a way that did not promote or commit acts of digital piracy. These files were spread on file hosting/shareware and forum websites before mainstream ROM-Hacking communities were established, such as Super Mario World Central, which stores the largest archive of Super Mario World ROM-Hacks and is the most popular use for this filetype.
- Popular IPS tools include:
Lunar IPS is the most popular, but does not support BPS like Floating IPS does. All the above are designed for Windows systems except EWing IPS Patcher, which should work on any POSIX system.
IPS is designed for 90’s PC operating systems such as MS-DOS, IPSMac is designed for early Macintosh systems, and JIPS is simply an ips handler written in Java.
- As well as more developmental IPS tools:
IPS Peek and Chief-Net IPS both allow selective patching, meaning that parts of the IPS may be excluded or included when patching at the user’s choice. ips.py is a Python ips tool; however, it is not as versatile as patchlib.ips, which is the recommended Python IPS module.
ROM Patcher JS, however, removes the need for executable “diff” appliers/makers, as the tool is made entirely in JavaScript and can therefore use mod files and base files in a browser. (The exception is xdelta files that use the Xdelta3 format, which requires an x64/ARM environment that most web services cannot offer.)
How does it work?
Simple: an IPS file has a static header reading PATCH and a footer reading EOF. The middle of the IPS file is an unterminated series of bytes made up of “instances”, which come in two forms of variable size.
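As a minimal sketch of the layout just described, the header and footer can be checked and removed before the instances are read. The function name strip_markers is illustrative and not part of any existing tool, and the sketch assumes the file ends with exactly the three EOF bytes, as described above.

```python
def strip_markers(patch: bytes) -> bytes:
    """Return the instance bytes between the PATCH header and the EOF footer."""
    if not patch.startswith(b"PATCH"):
        raise ValueError("missing PATCH header")
    if not patch.endswith(b"EOF"):
        raise ValueError("missing EOF footer")
    return patch[5:-3]  # 5 header bytes, 3 footer bytes


# An IPS file with no instances at all is just the two markers.
print(strip_markers(b"PATCHEOF"))
```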
Example of a noRLE Instance:
01 23 45 00 05 67 89 AB CD EF
To make that easier to read, let’s break it down.
01 23 45 | 00 05 | 67 89 AB CD EF
The first three bytes are a 24-bit big-endian number for the target offset, and the next two bytes are a 16-bit big-endian number for the size of the data. The final series of bytes is the data, whose length is described by the 16-bit size number.
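The breakdown above can be reproduced in a few lines of Python. This is only a sketch for the single example instance; parse_record is an illustrative name, not a function from patchlib or ips.py.

```python
def parse_record(record: bytes):
    """Split one noRLE instance into (offset, size, data)."""
    offset = int.from_bytes(record[0:3], "big")  # 24-bit target offset
    size = int.from_bytes(record[3:5], "big")    # 16-bit data length
    data = record[5:5 + size]                    # exactly `size` bytes of data
    return offset, size, data


record = bytes.fromhex("01 23 45 00 05 67 89 AB CD EF")
offset, size, data = parse_record(record)
print(hex(offset), size, data.hex())  # 0x12345 5 6789abcdef
```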
Example of an RLE Instance:
01 23 45 00 00 FF FF 64
The first three bytes are still the 24-bit target offset; however, you may notice the 16-bit size bytes are equal to zero, indicating that the data is zero in length. When the size is equal to zero, we treat this as an RLE flag indicating that this instance behaves differently.
We read past the two size bytes and take the next two bytes as a 16-bit number describing the hunk length of the RLE instance. The last byte is the hunk data and will be repeated until its length is equal to the 16-bit “hunk length”.
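Expanding the example RLE instance could look like the sketch below. expand_rle is an illustrative name, and the function only handles a single instance that is already known to carry the RLE flag.

```python
def expand_rle(record: bytes):
    """Decode one RLE instance into (offset, expanded data)."""
    offset = int.from_bytes(record[0:3], "big")  # 24-bit target offset
    if record[3:5] != b"\x00\x00":               # a size of zero is the RLE flag
        raise ValueError("not an RLE instance")
    hunk_length = int.from_bytes(record[5:7], "big")  # 16-bit repeat count
    hunk_data = record[7:8]                           # the single byte to repeat
    return offset, hunk_data * hunk_length


record = bytes.fromhex("01 23 45 00 00 FF FF 64")
offset, run = expand_rle(record)
print(hex(offset), len(run), run[:4].hex())  # 0x12345 65535 64646464
```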
Gaps between offsets that lie beyond the end of the original file are expected to be filled with zeroes, which drastically reduces patch size and likely complexity too, leading to faster handling. Within the original file, the gaps between offsets keep the original file contents, which are not included in the ips for both efficiency and integrity.
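The gap behaviour can be illustrated with a small helper that writes one piece of data at an offset. write_at is a hypothetical name used only for this sketch: bytes inside the base are overwritten, and any gap beyond the end of the base is padded with zeroes first.

```python
def write_at(base: bytes, offset: int, data: bytes) -> bytes:
    """Write data at offset, zero-filling any gap past the end of base."""
    out = bytearray(base)
    if offset > len(out):
        out.extend(b"\x00" * (offset - len(out)))  # gap beyond the original file
    out[offset:offset + len(data)] = data          # overwrite, growing if needed
    return bytes(out)


print(write_at(b"ABCD", 1, b"xy"))  # b'AxyD'
print(write_at(b"ABCD", 6, b"Z"))   # b'ABCD\x00\x00Z'
```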
How to read/apply one?
Here it is demonstrated in very simple Python, annotated throughout even where the code is verbose.
def apply(base: bytes, patch: bytes) -> bytes:
    patch = patch[5:-3]  # trim the "PATCH" header and "EOF" footer
    changes = {}  # dictionary to store diffs, keyed by target offset
    count = 0  # variable used to track progress in "patch"
    while count < len(patch):  # until we have read the last byte
        offset = patch[count:count+3]  # read next 3 bytes
        offset = int(offset.hex(), 16)  # convert to 24 bit integer
        count += 3
        size = patch[count:count+2]  # read next two bytes
        size = int(size.hex(), 16)  # convert to 16 bit integer
        count += 2
        if not size:  # if RLE flag set
            size = patch[count:count+2]  # access RLE length bytes
            size = int(size.hex(), 16)  # convert to 16 bit number
            count += 2
            data = patch[count:count+1]  # the single byte to repeat
            changes[offset] = (size, data)  # store RLE instance with offset as key
            count += 1
        else:
            data = patch[count:count+size]
            changes[offset] = data  # store noRLE instance with offset as key
            count += size
    output = b""
    for offset in sorted(changes):  # work in offset order; assumes diffs do not overlap
        if offset < len(base):  # if we are still overwriting
            output += base[len(output):offset]  # copy base until diff start
        else:
            output += base[len(output):]  # copy whatever remains of base
            output += b"\x00" * (offset - len(output))  # write zeroes until diff start
        if isinstance(changes[offset], tuple):  # RLE instance: repeat the byte
            size, data = changes[offset]
            output += data * size
        else:
            output += changes[offset]
    output += base[len(output):]  # if we have not written up to the end of base, then do so
    return output
The code above accepts two bytes objects and returns a bytes object, which can then be written to a file. If you only need this for patching files, then you could use:
def patchfile(modfile, basefile, outfile):
    def get(File):  # read a whole file as bytes
        with open(File, "rb") as f:
            return f.read()
    with open(outfile, "wb") as f:
        f.write(apply(get(basefile), get(modfile)))  # base first, then the ips
However, as ipsluna is a module, usage is determined by the user, and the fact that applications beyond standard usage are nothing short of eccentric does not invalidate that intention. This is where ipsluna exceeds ips.py.
How does `ips` building work?
ips constructing is much more detailed than ips applying, as we have to account for the following things:
- ips files should contain minimal original data.*
- ips files should not attempt to make an impossibly large file.**
- ips files should prefer rle unless setup is too costly.***
- ips files must write to the last byte of the new file if bigger, even if zero.
* This does not mean that it won’t work, it just means that you may end up creating an unnecessarily large file that contains potentially sensitive data.
** By default in `patchlib` it is set to `16,777,215 bytes` (16.7 MB), however `ips` may reach up to `16,842,750 bytes` by setting `legacy` to `False`.
*** This is merely optimization, no `ips` has to contain `rle`, however it should be noted that it is only optimal if the `rle` is of length `9` or higher.
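The byte counts behind the rle trade-off follow from the two instance layouts described earlier. The sketch below only does that arithmetic; the function names are illustrative, and the comment about the break-even point is one possible reading of the threshold of 9, not a description of patchlib's internals.

```python
HEADER = 3 + 2  # 24-bit offset plus 16-bit size


def plain_record_size(n: int) -> int:
    """Bytes needed to store n data bytes as a noRLE instance."""
    return HEADER + n


def rle_record_size() -> int:
    """Bytes needed for an RLE instance: header, 16-bit length, one data byte."""
    return HEADER + 2 + 1


# Extending an already open noRLE instance by a run of n bytes costs n bytes,
# while breaking the run out into its own RLE instance always costs 8,
# so the RLE form only starts to win once the run is 9 bytes or longer.
for n in (8, 9):
    print(n, "plain bytes vs", rle_record_size(), "rle bytes")
```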
Now that you know the rules, we can begin to create an ips file.
def build(base: bytes, target: bytes) -> bytes:
    patch, count = b"", 0
    # Lambdas for operation viability checks
    # viability: are the "dist" bytes of target starting at "offset" all one value?
    viability = lambda offset, dist: target[offset].to_bytes(1, "big") * dist == target[offset : offset + dist]
    # compare: does target differ from base at "offset"? (always True past the end of base)
    compare = lambda offset: (base[offset] != target[offset]) if offset < len(base) else True

    def rle():  # function for processing rle data
        length = 9
        # the bounds check comes first so target is never indexed past its end
        while count + length < len(target) and compare(count + length) and viability(count, length):
            length += 1
        return length - 1

    def norle():  # function for processing rle unviable data
        length = 1
        while count + length < len(target) and compare(count + length) and not (viability(count + length, 9) and all(compare(count + length + r) for r in range(9))):
            length += 1
        return length

    # while we have not compared the final byte
    while count < len(target):
        # if we are comparing the final byte
        if count == len(target) - 1:
            patch += count.to_bytes(3, "big") + b"\x00\x01" + target[count].to_bytes(1, "big")
            count += 1
        # if we have unnecessary data
        elif base[count] == target[count] if count < len(base) else target[count] == 0:
            while (base[count] == target[count] if count < len(base) else target[count] == 0) if count < len(target) - 1 else False:
                count += 1
        # now that we have our diff
        else:
            # determine rle viability
            isrle = viability(count, 9) and all(compare(count + r) for r in range(9))
            length = [norle, rle][isrle]()  # retrieve length to store
            # while length is impossible for a singular instance
            while length > 0xFFFF:
                if isrle:
                    patch += count.to_bytes(3, "big") + b"\x00\x00\xff\xff" + target[count].to_bytes(1, "big")
                else:
                    patch += count.to_bytes(3, "big") + b"\xff\xff" + target[count:count+0xFFFF]
                count += 0xFFFF
                length -= 0xFFFF
            # if data was not a multiple of 0xFFFF
            if length:
                if isrle:
                    patch += count.to_bytes(3, "big") + b"\x00\x00" + length.to_bytes(2, "big") + target[count].to_bytes(1, "big")
                else:
                    patch += count.to_bytes(3, "big") + length.to_bytes(2, "big") + target[count:count+length]
                count += length
    # return data
    return b"PATCH" + patch + b"EOF"
This ips construction code is written to keep the output as small as possible.
def makepatch(basefile, targetfile, outfile):
    def get(File):  # read a whole file as bytes
        with open(File, "rb") as f:
            return f.read()
    with open(outfile, "wb") as f:
        f.write(build(get(basefile), get(targetfile)))
Why do we sometimes use other patching filetypes?
bps, for example, uses variable-width offsets, and instead of immediate replacement it uses “actions” to move the data and perform selective “range” overwrites in order to achieve a goal with variable scope. ips has a reach of 16,842,750 bytes; however, a true legal ips could not write beyond the 24-bit maximum, and therefore the maximum reach is truly 16,777,215 bytes.
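Both figures follow from the widths of the offset and size fields. A quick check in Python, with constant names chosen only for this illustration:

```python
MAX_OFFSET = 0xFFFFFF  # largest value a 24-bit offset can hold
MAX_SIZE = 0xFFFF      # largest value a 16-bit size can hold

legal_reach = MAX_OFFSET
extended_reach = MAX_OFFSET + MAX_SIZE  # a full-size instance at the largest offset

print(f"{legal_reach:,}")     # 16,777,215
print(f"{extended_reach:,}")  # 16,842,750
```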
ips is also horribly inefficient at patching large files. Some patches may contain duplicates of the base data, which is not just inefficient but also poses a security risk for the original file contents. A simple ips integrity checker could be constructed to compare base contents to patch contents to see what resemblance there is.
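One possible shape for such a checker is sketched below. The names iter_records and redundant_bytes are hypothetical, the patch is assumed to be well formed as described earlier in this document, and the result is simply a count of patch bytes that already match the base at the same position.

```python
def iter_records(patch: bytes):
    """Yield (offset, data) for each instance, expanding RLE runs."""
    body = patch[5:-3]  # drop the PATCH header and EOF footer
    pos = 0
    while pos < len(body):
        offset = int.from_bytes(body[pos:pos + 3], "big")
        size = int.from_bytes(body[pos + 3:pos + 5], "big")
        pos += 5
        if size == 0:  # RLE flag: 16-bit run length, then one byte to repeat
            run = int.from_bytes(body[pos:pos + 2], "big")
            data = body[pos + 2:pos + 3] * run
            pos += 3
        else:
            data = body[pos:pos + size]
            pos += size
        yield offset, data


def redundant_bytes(base: bytes, patch: bytes) -> int:
    """Count patch bytes that are identical to the base at the same offset."""
    total = 0
    for offset, data in iter_records(patch):
        original = base[offset:offset + len(data)]
        total += sum(a == b for a, b in zip(data, original))
    return total


base = b"ABCDEFGH"
patch = b"PATCH" + (2).to_bytes(3, "big") + (3).to_bytes(2, "big") + b"CxE" + b"EOF"
print(redundant_bytes(base, patch))  # 2
```

A count well above zero suggests the patch carries more of the base file than it needs to.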
In conclusion, ips is designed for an older generation of consoles that were small and simplistic. As the scope of technology gradually increases, we may eventually see even bps become irrelevant. For now, and for a long time to come, it is irrational to assume that bps can be made redundant, as it has a theoretical reach of up to 2 exabytes.
Why do we still use ips if better filetypes exist?
Easiest question of them all: ips was just there when it needed to be. Because of its common usage and popularity when ROMhacking was more niche than it is now, the filetype has been the face of early ROMhacking. ips is actually quite space efficient for most of these hacks; it fits its scope perfectly.
In some cases, you may opt for bps over ips if the scope of the project would benefit from it; however, for minor edits within the size of the base file there is commonly zero reason not to choose ips unless the file you are modding requires a higher reach.
Why should I use patchlib over ips.py?
The main reason you should choose patchlib over ips.py is that it does what every other advanced patching tool does. After being passed the raw contents of an ips or initialising a blank canvas, patchlib offers total control of the ips. Each instance (diff) has the size, data, rle flag, and diff-reach stored in the instance class, as well as a name attribute which can be used to annotate an ips.
The benefit to all of this is that we can now interact smartly with the instances. We can access them with a variety of functions such as get and range, or by accessing the instances attribute within the ips class, which stores each instance in order of offset. We can also modify an individual instance with the modify method.
Moreover, the project is being actively worked on, and updates and new features should be expected. The code exceeds all known IPS tools and is not even at a release build yet. It has full docs on PyPI and active developers in immediate contact on the Discord!
Should I make my own ips handling tool?
There is very minimal reason to do this. As it stands, even when ips filetypes are being manipulated at a deep level, the tools provided are often not even fully used, as the user rarely goes beyond common building and applying. There is generally a surplus of tools, so should you create your own ips tool there should be a reason for it. patchlib’s existence is to provide total control in Python 3.
JIPS is forgivable as it runs in a Java runtime, meaning that it can run on devices that do not support Python 3. Because JIPS uses Java, whose whole ideology is that it can run in any environment, this tool is very helpful to those who do not have an operating system that any dedicated tool supports. The same would go for ips.py if patchlib did not render it redundant.
If you wish to make a tool, ensure that its benefits cannot already be found in someone else’s tools. Once you can confirm there is a point to doing this, bearing in mind scope, usability and cause, making an ips handler makes complete sense.
Can I contribute towards patchlib?
Yes! The patchlib GitHub allows for forks to be made, and anyone with some Python skill can be included in the project! In fact, there are many elements of the project left totally untouched that you could begin working on! If you are interested, feel free to get in contact on the Discord!
Is it not better just to make your own filetype?
This should be overall somewhat discouraged for these reasons:
- ips is standardized, people may not want to use your files/tools.
- It is quite likely that bps could solve this, people will use that instead.
- It creates some sort of proprietary sense to it, which may deter users.
- If tool sharing is too slow for demand, users may share original files.
If people do not want to use your tools then the project’s popularity will be stunted, and if people construct a bps between the base and result file then nobody will feel obliged to use your format or tool. In a world of common base files it is natural to assume a universal format for manipulation, so we opt for universal filetypes. Limiting control only works for immediate distribution.