LCF-AT Posted February 21, 2023

Hi guys,

I don't remember whether I already asked about this in the past (I can't find the post), so I need to ask now. A while ago I noticed that I have trouble with filenames that use special characters such as symbols. You mostly see that in YouTube video titles that use funny symbols etc. The problem is that I have only been using the normal ASCII functions in my MASM code, and they can't handle those symbol characters. Now I would like to fix that, but how?

Example: How do I get the complete, correct filename (with symbols) from a WM_DROPFILES message, edit controls etc.? How do I read those strange filenames and how do I handle them?

    Filename 💘 With Strange 💌 Symbols.file

I need to handle the name above and later also export it like this:

    Filename 💘 With Strange 💌 Symbols_Output.file

All in all I have just forgotten how to do it (I think), but I need to know it now to fix those problems. Maybe you can help quickly. Thanks.

greetz
NewLearner Posted February 22, 2023

Hello LCF,

declare the string as dw and use the Unicode version of the WinAPI. I guess those can handle filenames which contain emoji (I didn't try it).

NL.
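A minimal sketch of that suggestion, assuming MASM32 include files that declare the W prototypes (MessageBoxW here). The labels wszName and wszTitle are made up for the example. With dw every character is its own 16-bit element, and an emoji such as 💘 (U+1F498) does not fit into 16 bits, so it is written as the UTF-16 surrogate pair D83Dh, DC98h.

```asm
.486
.model flat, stdcall
option casemap:none

include \masm32\include\windows.inc
include \masm32\include\user32.inc
include \masm32\include\kernel32.inc
includelib \masm32\lib\user32.lib
includelib \masm32\lib\kernel32.lib

.data
; "Name 💘.file" as UTF-16: one dw element per code unit,
; the emoji U+1F498 takes two units (surrogate pair D83Dh, DC98h)
wszName  dw 'N','a','m','e',' ',0D83Dh,0DC98h,'.','f','i','l','e',0
wszTitle dw 'U','T','F','-','1','6',0

.code
start:
    ; the W function is called by its full name, so no __UNICODE__ equate is needed
    invoke MessageBoxW, NULL, addr wszName, addr wszTitle, MB_OK
    invoke ExitProcess, 0
end start
```

Whether the emoji is actually drawn depends on the fonts installed on the system; the string itself is passed through unchanged either way.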
NOP Posted February 22, 2023

You could either handle the Unicode or strip out the non-ANSI characters.

Personally, if the name wasn't in a different language, I would strip them out or change them, because emojis for example will only work on the site you are getting the files from and won't be shown in your filename, even if you handle Unicode.

Edited February 22, 2023 by NOP
LCF-AT (Author) Posted February 22, 2023

Hi guys,

maybe some words about my MASM sources. Normally I just use the ASCII functions (A) instead of the (W) functions, and have done so since I started coding. Yesterday I remembered something fearless said in the past about a specific equate in the MASM source that makes the app build in Unicode mode. I found this equate...

    __UNICODE__ EQU 1

...which I have to set directly where my inc/lib paths are, before .const / .data / .code. So I set it in my source, tried to compile and got some errors, because in my source I am also using macros like "chr$" to set strings directly. So I removed those macros and put the strings under .data. After those changes the exe compiled and ran, but not correctly: my LISTVIEW is no longer visible, among other things. All in all it is not working normally with those changes. I made a backup of my normal ASCII source so that I can use it again. So the main question is how to handle those Unicode or symbol-mixed filenames.

@NewLearner So you mean I have to change the strings from this...

    .data
    string_1 db '$url',0
    string_2 db '$name',0
    string_3 db '$path',0
    string_4 db '$ext',0

to this?

    .data
    string_1 dw '$url',0
    string_2 dw '$name',0
    string_3 dw '$path',0
    string_4 dw '$ext',0

...with dw (define word, 16 bits per element)? What should this achieve?

@NOP OK, how do I strip out those special characters? In this case the question is also how to get and use the direct file path. If I strip the symbols from the name, then I can't access the file anymore because the file path doesn't match. Example: I drag & drop a file with special characters into my app, and I need this path / filename for other processes I want to run, like ffmpeg and others. So I already have the problem of reading the filename IF it has those special characters, and I already fail at WM_DROPFILES / the DragQueryFile function. I just wonder how to make it work.

Do you guys have any tiny example source / exe where you can drag a file in and output it as a new file? Each time I have to work with those strange filenames my apps fail to read them and I need to rename the files, which is really bad. I just need to find a working solution for this.

greetz
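For the drag & drop part, a small sketch that calls the wide-character functions by their full names, so it does not depend on the __UNICODE__ equate at all. It assumes MASM32 include files that declare the W prototypes (DragQueryFileW, CreateWindowExW and so on); the labels wszClass, wszTitle, wszPath and the window itself are invented for the example. The window accepts dropped files and shows the first path in a MessageBoxW.

```asm
.486
.model flat, stdcall
option casemap:none

include \masm32\include\windows.inc
include \masm32\include\user32.inc
include \masm32\include\kernel32.inc
include \masm32\include\shell32.inc
includelib \masm32\lib\user32.lib
includelib \masm32\lib\kernel32.lib
includelib \masm32\lib\shell32.lib

WndProc PROTO :DWORD,:DWORD,:DWORD,:DWORD

.data
; UTF-16 strings, one dw element per character
wszClass dw 'D','r','o','p','W',0
wszTitle dw 'D','r','o','p',' ','a',' ','f','i','l','e',0

.data?
wc      WNDCLASSEX <>           ; same layout for the A and W variants
msg     MSG <>
wszPath dw MAX_PATH dup(?)      ; receives the dropped path as UTF-16

.code
start:
    invoke GetModuleHandleW, NULL
    mov wc.hInstance, eax
    mov wc.cbSize, sizeof WNDCLASSEX
    mov wc.lpfnWndProc, offset WndProc
    mov wc.lpszClassName, offset wszClass
    mov wc.hbrBackground, COLOR_BTNFACE+1
    invoke LoadCursorW, NULL, IDC_ARROW
    mov wc.hCursor, eax
    invoke RegisterClassExW, addr wc
    ; WS_EX_ACCEPTFILES makes the window a drop target for WM_DROPFILES
    invoke CreateWindowExW, WS_EX_ACCEPTFILES, addr wszClass, addr wszTitle,
           WS_OVERLAPPEDWINDOW or WS_VISIBLE, CW_USEDEFAULT, CW_USEDEFAULT,
           400, 200, NULL, NULL, wc.hInstance, NULL
  msgloop:
    invoke GetMessageW, addr msg, NULL, 0, 0
    cmp eax, 0
    jle done                    ; 0 = WM_QUIT, -1 = error
    invoke DispatchMessageW, addr msg
    jmp msgloop
  done:
    invoke ExitProcess, 0

WndProc proc hWnd:DWORD, uMsg:DWORD, wParam:DWORD, lParam:DWORD
    .if uMsg == WM_DROPFILES
        ; wParam is the HDROP handle; index 0 = first dropped file
        invoke DragQueryFileW, wParam, 0, addr wszPath, MAX_PATH
        invoke DragFinish, wParam
        invoke MessageBoxW, hWnd, addr wszPath, addr wszTitle, MB_OK
        xor eax, eax
        ret
    .elseif uMsg == WM_DESTROY
        invoke PostQuitMessage, 0
        xor eax, eax
        ret
    .endif
    invoke DefWindowProcW, hWnd, uMsg, wParam, lParam
    ret
WndProc endp
end start
```

The buffer filled by DragQueryFileW can be handed to other W functions such as CreateFileW as it is. Passing it to an A function would bring the original problem back, because the symbols have no representation in the ANSI code page.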
LCF-AT (Author) Posted February 22, 2023

PS: By the way, I checked my newly compiled file in Olly and see that only my own strings changed from ASCII to Unicode, but all functions still call the ASCII (A) version instead of W. Why? Shouldn't the equate "__UNICODE__ EQU 1" also change the functions?
NOP Posted February 22, 2023

Can't you strip out the characters before saving, so that you save under a new filename and then do any processing with the saved file? Or do you process directly via URL? In that case you have no option but to use the original name as it is, and you need to use Unicode.
LCF-AT (Author) Posted February 22, 2023

Somehow I have to read the file first (CreateFile etc.), but how? In the end I also want to use the real filename for the saved file. No idea how to handle that stuff.
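One way the "_Output" name from the first post could be built on the UTF-16 path, sketched with PathFindExtensionW from shlwapi plus lstrcpynW / lstrcatW. This assumes MASM32 include files with the W prototypes; wszSrc, wszSuffix and wszDst are made-up labels, and wszSrc is hard-coded here where a real app would use the buffer filled by DragQueryFileW. There is no length check, so the source name plus suffix is assumed to fit into MAX_PATH characters.

```asm
.486
.model flat, stdcall
option casemap:none

include \masm32\include\windows.inc
include \masm32\include\user32.inc
include \masm32\include\kernel32.inc
include \masm32\include\shlwapi.inc
includelib \masm32\lib\user32.lib
includelib \masm32\lib\kernel32.lib
includelib \masm32\lib\shlwapi.lib

.data
; "In 💘.file" as UTF-16 (hypothetical input name)
wszSrc    dw 'I','n',' ',0D83Dh,0DC98h,'.','f','i','l','e',0
wszSuffix dw '_','O','u','t','p','u','t',0

.data?
wszDst    dw MAX_PATH dup(?)

.code
start:
    invoke PathFindExtensionW, addr wszSrc   ; eax -> ".file" (or -> terminating 0)
    mov esi, eax                             ; keep the pointer to the extension
    sub eax, offset wszSrc                   ; bytes in front of the extension
    shr eax, 1                               ; bytes -> UTF-16 code units
    inc eax                                  ; lstrcpynW's count includes the terminator
    invoke lstrcpynW, addr wszDst, addr wszSrc, eax   ; copy the part before the extension
    invoke lstrcatW, addr wszDst, addr wszSuffix      ; append "_Output"
    invoke lstrcatW, addr wszDst, esi                 ; append the original extension
    invoke MessageBoxW, NULL, addr wszDst, addr wszSuffix, MB_OK
    invoke ExitProcess, 0
end start
```

The input would then be opened with CreateFileW on the original path and the result written with CreateFileW on wszDst, so the symbols survive in both names.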
LCF-AT (Author) Posted February 23, 2023

Hi again,

in the MASM help file I found this info...

Unicode Support

The MASM32 include files now have system wide unicode support by the inclusion of the equate __UNICODE__ at the start of the source code BEFORE any include files. This factor is critical as the include files need to know which prototype system is being used. The form of the equate is as follows,

    __UNICODE__ equ 1

The Windows API prototypes in the include file occur in this form,

    AddAtomA PROTO STDCALL :DWORD
    IFNDEF __UNICODE__
      AddAtom equ <AddAtomA>
    ENDIF

    AddAtomW PROTO STDCALL :DWORD
    IFDEF __UNICODE__
      AddAtom equ <AddAtomW>
    ENDIF

If you define the equate __UNICODE__ the UNICODE form of the API is provided in source code, if the equate is not defined the ASCII form of the API is provided. NOTE that all existing ASCII code written before unicode support work as they have always worked, the inclusion of the __UNICODE__ equate offers the additional option of coding directly in unicode without having to use the direct API name with the trailing "W".

...so I set that equate at the top of my source

    __UNICODE__ equ 1

but it does not produce W functions and still uses the A functions. Does anyone know how to make it work? The compiler options of my project in WinAsm are these...

    Assemble=/c /coff /Cp /nologo
    Link=/SUBSYSTEM:WINDOWS /RELEASE /VERSION:4.0

...maybe someone could give some advice.

greetz
jackyjask Posted February 24, 2023

18 hours ago, LCF-AT said:
> ..so I did set that command at the top of my source __UNICODE__ equ 1 but it does not create W functions and still using A functions

It does work in the case of MASM64, e.g.: https://prnt.sc/XxQmleBuxPZb
LCF-AT (Author) Posted February 24, 2023

Hi @jackyjask,

I'm not using MASM64, just x86.

    \masm32\Bin
    \masm32\include
    \masm32\Lib

Something isn't working right. In one source I tried the equate __UNICODE__ EQU 1, and then most ASCII strings were changed to Unicode strings but no API was changed...

    004013CC MOV DWORD PTR DS:[47D5D4],OFFSET ??00EA ; UNICODE "Name" <-----
    004013D6 MOV DWORD PTR DS:[47D5D0],7E
    004013E0 PUSH OFFSET lvc                         ; /lParam = 47D5C8
    004013E5 PUSH 1                                  ; |wParam = 1
    004013E7 PUSH 101B                               ; |Message = MSG(101B)
    004013EC PUSH DWORD PTR DS:[LISTVIEW]            ; |hWnd = NULL
    004013F2 CALL _OpenClipboard@4                   ; \SendMessageA <---- Still A not W

    0045F2EC=OFFSET ??00EA (UNICODE "Name")
    DS:[0047D5D4]=00000000
    bones.asm:611. mov lvc.pszText,chr$("Name") <----

...and in my other test sources it changes nothing when I use the equate __UNICODE__ EQU 1. No idea what the problem is. Do I have to use some specific include files? Example:

    __UNICODE__ EQU 1

    .data
    SOMETEST  db "StringHere",0     ; <--- no changes in any of these strings!?
    SOMETEST2 db 'StringHere2',0
    CMD_1     db "cmd.exe /k %s",0
    CMD_2     db 'cmd.exe /c %s',0

The strings were not changed from ASCII to Unicode:

    00402B5A PUSH 0045F0A9                  ; |Format = "cmd.exe /k %s"
    00402B5F PUSH EDI                       ; |s = bones.<ModuleEntryPoint>
    00402B60 CALL <JMP.&user32.wsprintfA>   ; \wsprintfA
    00402B65 ADD ESP,0C
    00402B68 JMP SHORT 00402B7D
    00402B6A PUSH 0045F0A7                  ; /<%s> = """
    00402B6F PUSH 0045F0B7                  ; |Format = "cmd.exe /c %s"
    00402B74 PUSH EDI                       ; |s = bones.<ModuleEntryPoint>
    00402B75 CALL <JMP.&user32.wsprintfA>   ; \wsprintfA

On the other hand, if I use the chr$ macro with '' or with "" then I get these results...

    invoke SendMessage,esi,EM_REPLACESEL,FALSE,chr$("$ext")   ; <-- does change to Unicode
    invoke SendMessage,esi,EM_REPLACESEL,FALSE,chr$('$ext')   ; <-- does not, or prints an error

    error A2071: initializer magnitude too large for specified size

Strange anyhow.

So the __UNICODE__ equate only changes the ASCII strings in my source that use the chr$ macro, but the functions are still A and the strings in the .data section are also still ASCII. So how do I make it work now? Can anyone post a tiny example source that I could try to see whether it works?

greetz
LCF-AT (Author) Posted February 24, 2023

Hi again,

here is an example...

    __UNICODE__ EQU 1
    include \masm32\include\masm32rt.inc

    .data
    T1 db "123",0
    T2 db 'test1',0

    .data?

    .code
    start:
    invoke MessageBox,0,chr$("ABC"),addr T2,MB_ICONINFORMATION
    invoke ExitProcess,eax
    end start

In Olly...

    00B31000 <ModuleEntryPoint> PUSH 40               ; /Style = MB_OK|MB_ICONASTERISK|MB_APPLMODAL
    00B31002 PUSH 00B33004                            ; |Title = "test1"
    00B31007 PUSH 00B3300A                            ; |Text = "A"
    00B3100C PUSH 0                                   ; |hOwner = NULL
    00B3100E CALL <JMP.&user32.MessageBoxA>           ; \MessageBoxA
    00B31013 PUSH EAX                                 ; /ExitCode = 53FDF0
    00B31014 CALL <JMP.&kernel32.ExitProcess>         ; \ExitProcess
    00B31019 INT3
    00B3101A JMP DWORD PTR DS:[<&user32.MessageBoxA>]     ; user32.MessageBoxA
    00B31020 JMP DWORD PTR DS:[<&kernel32.ExitProcess>]   ; KERNEL32.ExitProcess

Only the chr$("ABC") was changed to Unicode. What's the problem?

greetz
jackyjask Posted February 25, 2023

OK, the masm32 case.

I installed it from the official site and tried to build the Unicode example from this dir:

    c:\masm32\examples\unicode_generic\template\

It has got a

    __UNICODE__ equ 1      ; uncomment to enable UNICODE build

at the very top of template.asm. With that line active you get a Unicode (W) build.

Now, if you comment out that line like

    ;   __UNICODE__ equ 1      ; uncomment to enable UNICODE build

you get an ...A (x86) build.

So I'm not sure what's going on on your side, sorry... until you share your project. Or maybe try to get a vanilla masm32 and try again?
LCF-AT (Author) Posted February 25, 2023

Hi @jackyjask,

thanks for the info. I tried to compile that template source and I also get just A functions out. Then I installed MASM (masm32v11r.zip) + WinAsm in the Windows Sandbox and compiled a dialog with __UNICODE__ equ 1 at the top and a MessageBox call, and there it works and produces the W function. Hmm! But strings like this...

    .data
    T1 db "TEST",0
    T2 db "CAP",0

...are not changed to Unicode and are still ASCII. My question here is how to deal with strings in sections (.data) and directly in the source. Which strings get changed to Unicode and which don't? Are there any rules?

I also have another problem. After installing in the Windows Sandbox, the linker option /DYNAMICBASE:NO no longer works! Why?

    /SUBSYSTEM:WINDOWS /RELEASE /VERSION:4.0 /DYNAMICBASE:NO "/LIBPATH:\Masm32\Lib" "C:\WinAsm\Templates\Dialog\bones\bones.obj" "C:\WinAsm\Templates\Dialog\bones\bones.res" "/OUT:C:\WinAsm\Templates\Dialog\bones\bones.exe"

    LINK : warning LNK4044: unrecognized option "DYNAMICBASE:NO"; ignored

PS: I will try a fresh install of MASM on my main OS.

greetz
LCF-AT (Author) Posted February 25, 2023

EDIT: So I have now installed that MASM SDK on my main OS too. I also get that DYNAMICBASE error when building now. Otherwise the functions do change to W now, but the strings in the section stay the same... why? Look...

    __UNICODE__ EQU 1

    .486                        ; create 32 bit code
    .model flat, stdcall        ; 32 bit memory model
    option casemap :none        ; case sensitive

    include \masm32\include\windows.inc     ; main windows include file
    include \masm32\macros\macros.asm       ; masm32 macro file
    include \masm32\include\masm32.inc
    include \masm32\include\user32.inc
    include \masm32\include\kernel32.inc
    includelib \masm32\lib\user32.lib
    includelib \masm32\lib\kernel32.lib

    ; These strings stay ASCII
    .data
    T1 db "123",0
    T2 db 'test1',0

    .data?
    buf dd ?

    .code
    start:
    lea eax, chr$("ANI")        ; <-- does change to Unicode
    lea eax, chr$("ANI2")       ; <-- does change to Unicode
    ; lea eax, chr$('ANI3')     ; <-- does not compile if __UNICODE__ is set
    invoke MessageBox,0,offset T1,offset T2,MB_ICONINFORMATION
    invoke MessageBox,0,chr$("Test"),chr$("Caption"),MB_ICONINFORMATION
    invoke ExitProcess,eax
    invoke wsprintf,addr buf,addr T1,eax
    invoke wsprintf,addr buf, chr$("123 %s"),eax
    end start

...the APIs do change to W = OK, and the strings set with the macro chr$("XY") also change to Unicode, but not the strings in the section, which means that the W functions are given ASCII strings. How can I change all ASCII strings in the .data section to Unicode at once? Is there any command I can set so that I don't have to change every single string? Below is Olly...

    00401000 <ModuleEntryPoint> LEA EAX,DWORD PTR DS:[??0019]
    00401006 MOV EDI,EDI                            ; SINA.<ModuleEntryPoint>
    00401008 LEA EAX,DWORD PTR DS:[??002C]
    0040100E PUSH 40
    00401010 PUSH OFFSET T2                         ; ASCII "test1"
    00401015 PUSH OFFSET T1                         ; ASCII "123"
    0040101A PUSH 0
    0040101C CALL <JMP.&user32.MessageBoxW>
    00401021 LEA ECX,DWORD PTR DS:[ECX]
    00401024 PUSH 40
    00401026 PUSH 0040302C                          ; UNICODE "Caption"
    0040102B PUSH 00403020                          ; UNICODE "Test"
    00401030 PUSH 0
    00401032 CALL <JMP.&user32.MessageBoxW>
    00401037 PUSH EAX
    00401038 CALL <JMP.&kernel32.ExitProcess>
    0040103D PUSH EAX
    0040103E PUSH OFFSET T1                         ; ASCII "123"
    00401043 PUSH 00403050
    00401048 CALL <JMP.&user32.wsprintfW>
    0040104D ADD ESP,0C
    00401050 _start PUSH EAX
    00401051 PUSH 0040303C                          ; UNICODE "123 %s"
    00401056 PUSH 00403050
    0040105B CALL <JMP.&user32.wsprintfW>
    00401060 ADD ESP,0C
    00401063 INT3
    00401064 JMP DWORD PTR DS:[<&user32.MessageBoxW>]     ; user32.MessageBoxW
    0040106A JMP DWORD PTR DS:[<&user32.wsprintfW>]       ; user32.wsprintfW
    00401070 JMP DWORD PTR DS:[<&kernel32.ExitProcess>]   ; KERNEL32.ExitProcess

greetz
jackyjask Posted February 26, 2023

Good progress!

Regarding

> ; This strings keep ASCII
> .data
> T1 db "123",0
> T2 db 'test1',0

yeah, that's true... there is no auto-magic that will convert ASCII strings defined with db (define byte) into Unicode ones. You have to manually replace all your db statements with some macro for Unicode strings.

Or you could go even more hardcore and add a 0 between the chars, e.g.:

    Unicode db 55h,0,6Eh,0,69h,0,63h,0,6Fh,0,64h,0,65h,0,0,0

but this is a very brutal way, I don't like it.

Also, inside the masm32 help files (*.chm) you can find a lot of info about the asm macros, e.g.:

    sas    Assign a string to a LOCAL variable. __UNICODE__ aware.
    cst    Copy one zero terminated string to another. __UNICODE__ aware.

File location: c:\masm32\help\hlhelp.chm
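A different technique from the macros mentioned above, shown only as a sketch: existing db strings can stay as they are and be converted at run time with MultiByteToWideChar before they are passed to a W function. This assumes MASM32 include files with the W prototypes; szText, wszText and wszTitle are made-up labels and the 64-character buffer size is arbitrary.

```asm
.486
.model flat, stdcall
option casemap:none

include \masm32\include\windows.inc
include \masm32\include\user32.inc
include \masm32\include\kernel32.inc
includelib \masm32\lib\user32.lib
includelib \masm32\lib\kernel32.lib

.data
szText   db "StringHere",0        ; existing ANSI string, left untouched
wszTitle dw 'T','e','s','t',0     ; UTF-16 string written directly with dw

.data?
wszText  dw 64 dup(?)             ; receives the UTF-16 copy of szText

.code
start:
    ; -1 = source is zero terminated, so the terminator is converted too;
    ; the last argument is the buffer size in characters, not bytes
    invoke MultiByteToWideChar, CP_ACP, 0, addr szText, -1, addr wszText, 64
    invoke MessageBoxW, NULL, addr wszText, addr wszTitle, MB_OK
    invoke ExitProcess, 0
end start
```

This only helps for strings that start out as ANSI text in the source. A filename that already contains emoji has to be obtained as UTF-16 in the first place, because the conversion cannot bring back characters the A functions have already lost.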
jackyjask Posted February 26, 2023

Update regarding the /DYNAMICBASE linker option:

just run

    link.exe /?

and check whether you see that option. If not, you are using a very old MS linker and have to "borrow" some more modern files from VS, e.g.:

    c:\VS2019\VC\Tools\MSVC\14.29.30133\bin\Hostx86\x86\link.exe

Edited February 26, 2023 by jackyjask
LCF-AT (Author) Posted February 26, 2023

Hi @jackyjask,

thanks for trying to help me with that problem. Changing all ASCII strings in my sections plus all the other added ASM / include files sounds like horror! Managing that manually seems like a bad idea somehow, but I will try it. By the way, does it then make more sense to code / compile ONLY in Unicode mode instead of ASCII? How do I handle those Unicode strings now? Do you have a list of macros & functions I can use instead of the ASCII macros? I see there are also some issues with some Unicode macro names, e.g. WSTR sometimes causes problems where UCSTR does not. Do you have any helpful info on how to handle everything correctly in Unicode style? As I said, I have only been using the normal ASCII style.

I checked the linker version / support and see that the MASM linker, version 5.12, does not support that option. I also found other linker files on my HDD here...

    C:\Program Files (x86)\Microsoft Visual Studio\2019\BuildTools\VC\Tools\MSVC\14.29.30133\bin\Hostx86\x86\

...which have linker version 14.29 with many more options I can use. I copied that link.exe + some dll files (mspdb140.dll, tbbmalloc.dll) into my MASM bin folder and now the build works.

greetz
jackyjask Posted February 27, 2023

Hello!

Well, ideally I guess you need to wrap all your good old ASCII strings in some kind of macro that is smart enough (hehe) to produce either ASCII or Unicode depending on whether __UNICODE__ is defined. So yeah, some manual work will be needed.

I'm not very much into asm, mostly C/C++, but it has a similar approach, e.g.: https://devblogs.microsoft.com/oldnewthing/20040212-00/?p=40643

If you decide to always use Unicode from now on, you could start using the WSTR macro. If you want to be flexible, you could write your own.

Anyway, I encourage you to read the \macros\macros.asm file, it has tons of useful macros. Also don't hesitate to read the official (and very old school) forum, e.g.: https://masm32.com/board/index.php?topic=2054.msg29631#msg29631

2) Regarding the linker: some people like the Pelles C binaries (asm / linker / res util). You could grab those from the official site and use them in your .bat file. But if you like MS products, that's up to you.
LCF-AT (Author) Posted February 6

Hi again,

today I looked at my test project again, trying to turn my ASCII app into a Unicode app. It is still not working, and I'm not sure whether it will be possible to change everything in my source. Now I have a problem with an API function from the shlwapi module called StrFormatByteSize64. Normally the file compiles fine in ASCII mode, but when I set "__UNICODE__ EQU 1" I get this error...

    error A2006: undefined symbol : StrFormatByteSize64

...but why? The inc & lib are included, and inside the inc I can read this...

    StrFormatByteSize64A PROTO STDCALL :DWORD,:DWORD,:DWORD,:DWORD
    IFNDEF __UNICODE__
      StrFormatByteSize64 equ <StrFormatByteSize64A>
    ENDIF

...so why does it fail when I use the __UNICODE__ equate?

greetz
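jackyjask Posted February 6

There is just one WinAPI export with that name, StrFormatByteSize64A; there is no W version, so with __UNICODE__ defined the include file has nothing to map StrFormatByteSize64 to.

https://learn.microsoft.com/en-us/windows/win32/api/shlwapi/nf-shlwapi-strformatbytesize64a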
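According to the Microsoft page linked above, the Unicode counterpart is StrFormatByteSizeW, which already takes a 64-bit size. A hedged sketch of calling it directly follows. It assumes that shlwapi.inc declares StrFormatByteSizeW with four DWORD parameters (the 64-bit value split into low and high dword, as in the StrFormatByteSize64A prototype quoted above); if the include file declares it differently, the invoke line has to be adjusted. wszSize and wszTitle are made-up labels.

```asm
.486
.model flat, stdcall
option casemap:none

include \masm32\include\windows.inc
include \masm32\include\user32.inc
include \masm32\include\kernel32.inc
include \masm32\include\shlwapi.inc
includelib \masm32\lib\user32.lib
includelib \masm32\lib\kernel32.lib
includelib \masm32\lib\shlwapi.lib

.data
wszTitle dw 'S','i','z','e',0

.data?
wszSize  dw 32 dup(?)             ; receives the formatted size as UTF-16

.code
start:
    ; 5,000,000,000 bytes = 12A05F200h, passed as low dword, then high dword
    ; (assumption: the PROTO in shlwapi.inc is :DWORD,:DWORD,:DWORD,:DWORD)
    invoke StrFormatByteSizeW, 2A05F200h, 1, addr wszSize, 32
    invoke MessageBoxW, NULL, addr wszSize, addr wszTitle, MB_OK
    invoke ExitProcess, 0
end start
```

Calling StrFormatByteSize64A by its full name also still assembles with __UNICODE__ set, but then the output buffer contains an ANSI string and must not be passed to W functions without conversion.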