MD5 hash

The MD5 (Message Digest 5) algorithm produces a 128-bit (16-byte) hash value, typically expressed as a 32-digit hexadecimal number. It was designed in 1991 and is commonly used to generate a "fingerprint" for files or strings of text, that is, a checksum for verifying data integrity against unintentional corruption.

However, MD5 is now considered cryptographically broken and insecure. It is no longer suitable for security-sensitive applications such as password storage or digital signatures, and modern systems should use more secure alternatives like SHA-256 or SHA-3 for cryptographic purposes.

Here we just want to generate a fingerprint for a file or a string. For this purpose, MD5 is fine.

See src/checksums/md5.

```
nimble install checksums
```

```nim
import checksums/md5

echo getMD5("")     # d41d8cd98f00b204e9800998ecf8427e
echo getMD5("nim")  # 51aaf9dbcf1c573b12b329a5668ec05a

let fname = "tree.jpg"
echo getMD5(readFile(fname))  # md5 hash of the file
```
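The sizes quoted above (16 bytes, 32 hex digits) can be checked quickly. Below is a minimal sketch in Python using the standard `hashlib` module; the helper name `md5_hex` is made up for this illustration, and the `"abc"` value is the well-known test vector from RFC 1321.

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """Return the MD5 fingerprint of data as a lowercase hex string.

    md5_hex is a hypothetical helper name used only in this sketch.
    """
    return hashlib.md5(data).hexdigest()


raw = hashlib.md5(b"").digest()
print(len(raw))           # 16  (bytes, i.e. 128 bits)
print(len(md5_hex(b"")))  # 32  (hex digits)
print(md5_hex(b""))       # d41d8cd98f00b204e9800998ecf8427e
print(md5_hex(b"abc"))    # 900150983cd24fb0d6963f7d28e17f72
```

The digest of the empty string matches the value printed by `getMD5("")` above, which is a handy sanity check when comparing implementations.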
MD5 hash of a huge file

If you have a huge file, it's better to read it in chunks.

```nim
import checksums/md5

proc md5File(path: string, chunkSize = 1024 * 1024): string =
  ## Computes the MD5 hash of a file by reading it in chunks.
  ## Default chunk size is 1 MB.
  var ctx: MD5Context
  md5Init(ctx)

  var f = open(path, fmRead)
  defer: f.close()

  # Reuse a single buffer for every read.
  var buf = newSeq[uint8](chunkSize)
  while true:
    let bytesRead = f.readBytes(buf, 0, chunkSize)
    if bytesRead == 0:
      break
    # Feed only the bytes that were actually read (the last chunk may be shorter).
    md5Update(ctx, buf.toOpenArray(0, bytesRead - 1))

  var digest: MD5Digest
  md5Final(ctx, digest)
  result = $digest

# ----------

let fname = "ubuntu.iso"
echo md5File(fname)  # 725e0a5bf98d2b5c9c0f13d8c38cae79
```

Some speed comparisons:

```bash
# Linux command:
$ time md5sum ubuntu.iso
# 3.44 sec

# Nim, DEBUG mode:
$ nim c bigfile.nim
$ time ./bigfile ubuntu.iso
# 94.61 sec

# Nim, release mode:
$ nim c -d:release bigfile.nim
$ time ./bigfile ubuntu.iso
# 5.67 sec

# Nim, speed mode with GCC:
$ nim c -d:release --opt:speed bigfile.nim
$ time ./bigfile ubuntu.iso
# 5.69 sec

# Nim, speed mode with Clang:
$ nim c --cc:clang -d:release --opt:speed bigfile.nim
$ time ./bigfile ubuntu.iso
# 4.75 sec
```

The size of [...]

These are hot runs, i.e. the file [...]

Notice how slow the debug build is! Lesson learned: when you use a Nim program in production, always compile it in release mode!

Summary: the Linux command [...]

I also tested what happens if I read the whole content with [...]

In Python

```python
import hashlib

def md5_file(path: str, chunk_size: int = 1024 * 1024) -> str:
    h = hashlib.md5()
    with open(path, "rb") as f:
        # Read until f.read() returns an empty bytes object (end of file).
        while chunk := f.read(chunk_size):
            h.update(chunk)
    return h.hexdigest()

print(md5_file("ubuntu.iso"))  # 725e0a5bf98d2b5c9c0f13d8c38cae79
```

Execution time: 3.70 sec. Surprisingly fast! It's just as fast as the Linux command.

I asked Claude AI why Python is so fast in this case. Got the following response:

"CPython's [...] Nim's [...]"

That makes sense. A pure-Python implementation would be very slow. See also https://docs.python.org/3/library/hashlib.html. OpenSSL is mentioned several times. It's very likely that your Python is compiled with OpenSSL.

Here is the Nim implementation: link. It's just pure Nim code.
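Reading in chunks does not change the result: feeding the data piece by piece yields the same digest as hashing the whole content at once, only with much less memory. Here is a minimal Python sketch of that check, assuming a small random temporary file stands in for the huge file. The helper names `md5_chunked`, `md5_whole` and `make_sample_file` are made up for this illustration.

```python
import hashlib
import os
import tempfile


def md5_chunked(path: str, chunk_size: int = 64 * 1024) -> str:
    """Hash the file incrementally, holding at most chunk_size bytes in memory."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            h.update(chunk)
    return h.hexdigest()


def md5_whole(path: str) -> str:
    """Hash the file in one shot; the whole content is loaded into memory."""
    with open(path, "rb") as f:
        return hashlib.md5(f.read()).hexdigest()


def make_sample_file(size: int = 300_000) -> str:
    """Create a temporary file with random bytes (a stand-in for a big file)."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(os.urandom(size))
    return path


sample = make_sample_file()
try:
    # The size is not a multiple of the chunk size, so the last chunk is partial.
    print(md5_chunked(sample) == md5_whole(sample))  # True
finally:
    os.remove(sample)
```

The chunk size only affects memory use and the number of read calls, not the resulting hash.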