MBF (Microsoft Binary Format) to IEEE conversion
Google has it all, and if it doesn't have it, it probably doesn't exist.
That's what I ran into when I tried to solve this problem: converting the MBF format to the IEEE floating point standard.
When I was looking for documentation about this issue, I found a lot of people asking the same question:
How do I translate the bits between these formats?
And in most cases, several failing algorithms had been posted on the internet, guessing at the conversion and speculating about
the layout of these two types of floats.
Why do we need to know more about this deprecated Microsoft format? The answer is easy: a lot of software still relies on this floating point format to work. In my case, the QWK offline message packet format uses these floats to index messages. (Me working on QWK is another story.)
So here I was, trying hard to figure out how the fuck to convert these bits. First, I tried the scheme published in an old document on the QWK format from 1992.
QWK Mail Packet File Layout
by Patrick Y. Lee
[...]
Microsoft binary (by Jeffery Foy):
   31 - 24    23     22 - 0      <-- bit position
+----------+------+----------+
| exponent | sign | mantissa |
+----------+------+----------+
IEEE (C/Pascal/etc.):
  31    30 - 23     22 - 0       <-- bit position
+------+----------+----------+
| sign | exponent | mantissa |
+------+----------+----------+
[...]
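To make the two diagrams concrete, here is a minimal sketch that pulls the three MBF fields out of four bytes and decodes the value by plain arithmetic. It rests on assumptions the quoted document does not state: the bytes are stored little-endian (as in the sample bytes used in the function further down), the MBF exponent is biased by 128 with an implied leading 1 in the mantissa, and an exponent of 0 means zero. The names `mbf_fields` and `mbf_value` are made up for this example.

```python
import struct


def mbf_fields(raw):
    """Split 4 little-endian MBF bytes into (exponent, sign, mantissa)."""
    (word,) = struct.unpack("<I", raw)
    exponent = (word >> 24) & 0xFF   # bits 31-24
    sign = (word >> 23) & 0x1        # bit 23
    mantissa = word & 0x7FFFFF       # bits 22-0
    return exponent, sign, mantissa


def mbf_value(raw):
    """Decode an MBF single by arithmetic (assumed bias: 128)."""
    exponent, sign, mantissa = mbf_fields(raw)
    if exponent == 0:                # assumed convention: exponent 0 is zero
        return 0.0
    # Implied leading 1 with the binary point in front of it: 0.1mmm...
    significand = (0x800000 | mantissa) / float(1 << 24)
    value = significand * 2.0 ** (exponent - 128)
    return -value if sign else value


print(mbf_fields(b"\x00\xe0\x0f\x8b"))  # (139, 0, 1040384)
print(mbf_value(b"\x00\xe0\x0f\x8b"))   # 1151.0
```

This only reads the MBF side. It does not build the IEEE bit pattern, which is the part that kept going wrong for me.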
So I tried different approaches to convert these bytes. I flatly refused to use C to solve this, because most of my QWK processing is in Python, and binding a C function to Python wasn't my first choice at all.
After desperately failing at this conversion every time, I managed to get very close to solving it, but 4 bits were always wrong.
Researching some more on Google, I found a concept in a code snippet on the markcoder site about the MBF --> IEEE conversion: a 'Magic Number'.
I was like: OK, that's it. Fuck magic. There is no magic in computers. I refuse to believe in variables called 'blackMagic' or 'magicMask' or 'fuckingMagicWhateverItThisShit' (although, while coding this function, I wrote this kind of variable several times during my trial-and-error period).
Then I remembered a set of libraries for VB6 called 'Stamina' that I used ages ago to do some dirty work in Visual Basic 6, and in it I found MBFIEE32.DLL with a function inside: DxToIEEEs. My function. THE function.
A bit later, I was running a little VB6 program on Windows XP in VirtualBox.
OK. I got the correct numbers: the float rendered as an integer, just as I expected. The library was doing exactly what I needed.
So I started to disassemble the segment of code that does the "black magic".
After analysing the asm code, I finally wrote a working conversion in Python.
I used the BitString library to do the bit handling.
import struct
from bitstring import BitString


class Utils:
    def StaminaMSBToIeeeDissasembled(self, pBytes):
        r""" This function was reverse engineered from MBF2IEE.DLL
        from vb6 stamina libraries
        CPU Disasm
        -------------
        MBFIEEE32 --> Function: DxToIEEEs (Stamina Lib for vb6)
        -------------
        Address   Hex dump        Command                          Comments
        10001070  /$ 55           PUSH EBP                         ; C Convention
        10001071  |. 8BEC         MOV EBP,ESP
        10001073  |. 56           PUSH ESI
        10001074  |. 53           PUSH EBX                         ; End of C Convention
        10001075  |. 8B75 08      MOV ESI,DWORD PTR SS:[ARG.1]     ; we get the argument (pointer to my Single)
        10001078  |. 66:8B5E 02   MOV BX,WORD PTR DS:[ESI+2]       ; and we load its high word: BL = byte 2, BH = byte 3
        1000107C  |. 66:8BCB      MOV CX,BX                        ; we copy bx to cx, we'll need it later
        1000107F  |. 66:33C0      XOR AX,AX                        ; clear ax
        10001082  |. 8AC7         MOV AL,BH                        ; we copy BH (the MBF exponent byte) to AL
        10001084  |. 66:83F8 03   CMP AX,3                         ; and we check if this byte is lower than 3
        10001088  |. 72 1A        JB SHORT ax_is_below_three
        1000108A  |. 2C 02        SUB AL,2                         ; ok, it wasn't, so we subtract 2 from that byte
        1000108C  |. 86C4         XCHG AH,AL                       ; and we save it to AH
        1000108E  |. 02DB         ADD BL,BL                        ; we add BL to itself o.o (this pushes the sign bit into the carry flag)
        10001090  |. 66:D1D8      RCR AX,1                         ; rotate 1 bit to the right, through the carry
        10001093  |. 66:83E1 7F   AND CX,007F                      ; aha, I've seen this number before, so we mask some bits
        10001097  |. 66:0BC1      OR AX,CX
        1000109A  |> 66:8946 02   MOV WORD PTR DS:[ESI+2],AX       ; yes, we save the result in these bytes.
        1000109E  |. 5B           POP EBX                          ; and closing the C convention, we fucking leave.
        1000109F  |. 5E           POP ESI
        100010A0  |. C9           LEAVE
        100010A1  |. C2 0400      RETN 4
        100010A4  |> 33C0         XOR EAX,EAX
        100010A6  |. 8906         MOV DWORD PTR DS:[ESI],EAX
        100010A8  \.^ EB F0       JMP SHORT 1000109A
        We only care about two of the bytes (the high word) """
        # pBytes = b"\x00\xe0\x0f\x8b"  # this number is equal to: 1151
        ms = BitString(bytes=pBytes, length=32)
        a = ms[24:32]                 # byte 3: the MBF exponent (BH in the asm)
        b = ms[16:24]                 # byte 2: sign bit + top 7 mantissa bits (BL)
        cx = ms[16:24] + ms[24:32]    # copy of the high word, byte 2 first
        # we check whether the exponent byte is < 3
        intA = int(struct.unpack('B', a.tobytes())[0])
        if intA >= 3:
            # we do a lot of things.
            intA -= 2
            a = BitString(uint=int(intA), length=8)  # we save the changes to that byte
            # now, we do byte 2 * 2 (ADD BL,BL). The only thing that survives
            # is the bit that falls off the top: the sign, which goes to the carry.
            carry = b[0:1]
            # now comes a tricky part.
            # in the disassembly of MBF2IEE.DLL (from stamina)
            # here comes a rotate that needs 2 bytes to be done,
            # so I create the final 2 bytes that will be stored in the IEEE result
            convertionBytes = a + BitString(bytes=b"\x00", length=8)
            # we rotate them to the right through the carry (RCR AX,1)
            convertionBytes = carry + convertionBytes[0:15]
            # now, I need to mask the previously saved word 'cx' with 0x7f
            masked = struct.unpack('<H', cx.tobytes())[0] & 0x007f
            masked = BitString(uint=masked, length=16)
            # and we OR convertionBytes with masked !
            tmpResult = (convertionBytes | masked)
            # we put things back together, most significant byte first
            ieee = tmpResult + ms[8:16] + ms[0:8]
            i = struct.unpack('>f', ieee.tobytes())[0]
        else:
            # we do a lot of OTHER things
            i = 0  # like return zero ;D
        return int(i)
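For anyone who doesn't want the extra dependency, the same steps can be sketched with nothing but `struct` and integer shifts. This is my reading of the disassembly above, not the library's official algorithm. The name `mbf_to_ieee` is made up for this example, it assumes the four bytes arrive in little-endian order, and it returns a float instead of an int.

```python
import struct


def mbf_to_ieee(raw):
    """Convert 4 little-endian MBF bytes to a float, following DxToIEEEs."""
    low, high = struct.unpack("<HH", raw)   # high = the word at [ESI+2]
    exponent = high >> 8                    # BH: the MBF exponent byte
    if exponent < 3:                        # JB taken: the result is zero
        return 0.0
    carry = (high >> 7) & 1                 # ADD BL,BL: sign bit goes to the carry
    ax = (exponent - 2) << 8                # SUB AL,2 then XCHG AH,AL
    ax = (carry << 15) | (ax >> 1)          # RCR AX,1: rotate right through carry
    ax |= high & 0x007F                     # AND CX,007F then OR AX,CX
    # The low word is untouched, only the high word is rewritten.
    return struct.unpack("<f", struct.pack("<HH", low, ax))[0]


print(mbf_to_ieee(b"\x00\xe0\x0f\x8b"))  # 1151.0
```

So the 'magic' is just this: subtract 2 from the exponent, then move the sign bit from bit 23 up to bit 31 and slide the exponent down one place to make room for it.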
I really hope this is helpful to someone out there.

5 August, 2010 - 20:25