- What causes garbled Japanese text (文字化け) in CSV files?
- Japanese Windows applications like Excel historically default to Shift-JIS encoding when saving CSV files. Modern tools (Python, Google Sheets, macOS) expect UTF-8. When a file is decoded with the wrong encoding, the byte values are misinterpreted and you see gibberish: a Shift-JIS file read as UTF-8 typically shows replacement characters (�), while UTF-8 text read as Shift-JIS shows strings like 縺ゅ>縺・ instead of あいう. This converter reads the bytes with the correct codec and re-saves them as UTF-8.
- How do I know which encoding my file is using?
- If the file came from a Japanese Windows PC and has a .csv extension, it is almost certainly Shift-JIS (also called Windows-31J or CP932). EUC-JP is common in older Linux/Unix systems or databases. UTF-16 is used by Excel on Windows when you choose "Unicode Text (.txt)" as the save format. ISO-2022-JP appears mainly in old email exports. If you are unsure, try Shift-JIS first — it is correct in the vast majority of cases.
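When the origin of a file is unknown, trial decoding can narrow the options. The sketch below is a heuristic under stated assumptions, not a reliable detector: the function name `guess_encoding` and the candidate order are choices made for this example, and short or ASCII-only files can decode cleanly under more than one codec.

```python
from typing import Optional


def guess_encoding(raw: bytes) -> Optional[str]:
    """Return the first candidate codec that decodes `raw` without errors."""
    # UTF-16 text exported by Excel starts with a byte order mark.
    if raw.startswith((b"\xff\xfe", b"\xfe\xff")):
        return "utf-16"

    # UTF-8 is strict, so it goes first. EUC-JP is tried before CP932 because
    # EUC-JP bytes often also decode "successfully" as CP932 half-width kana.
    candidates = ["utf-8", "euc_jp", "cp932"]

    # ISO-2022-JP is 7-bit and switches character sets with ESC sequences,
    # so it is only worth trying when an ESC byte is present.
    if b"\x1b" in raw:
        candidates.insert(0, "iso2022_jp")

    for name in candidates:
        try:
            raw.decode(name)
        except UnicodeDecodeError:
            continue
        return name
    return None
```

A result from this function is a starting point; checking a few decoded lines by eye is still the safest confirmation.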
- What is the difference between Shift-JIS and CP932?
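- CP932 (also called Windows-31J) is Microsoft's extended version of Shift-JIS. It adds vendor-specific code points, namely the NEC special characters (such as ① and ㈱) and the IBM extensions (such as 髙 and 﨑), and it maps a few symbols differently: the byte pair 0x81 0x60 is the wave dash (〜, U+301C) in standard Shift-JIS but the fullwidth tilde (～, U+FF5E) in CP932. The browser's TextDecoder with the shift_jis label actually decodes CP932, so this converter handles both transparently. There is no separate CP932 option to try: the Shift-JIS option already covers the NEC special characters and IBM extensions.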
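Outside the browser the two names are not always interchangeable. In Python, for example, `shift_jis` is the strict JIS X 0208 codec and `cp932` is the Microsoft variant, so the difference is easy to observe. The helper `decodes_with` below is a hypothetical name used only for this illustration.

```python
def decodes_with(raw: bytes, codec: str) -> bool:
    """Report whether `raw` decodes cleanly with the given codec."""
    try:
        raw.decode(codec)
        return True
    except UnicodeDecodeError:
        return False


# NEC special characters (①, ㈱) and IBM extension kanji (髙, 﨑).
extended = "①㈱髙﨑".encode("cp932")

print(decodes_with(extended, "cp932"))      # the Microsoft variant knows them
print(decodes_with(extended, "shift_jis"))  # the strict codec rejects them

# The same two bytes map to different code points in the two codecs.
tilde_bytes = b"\x81\x60"
print(hex(ord(tilde_bytes.decode("shift_jis"))))
print(hex(ord(tilde_bytes.decode("cp932"))))
```

When scripting a conversion of files that came from Windows, `cp932` is therefore usually the safer codec name to pass.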
- Can I convert multiple files at once?
- Yes. Select multiple files from the file picker (hold Shift or Ctrl) or drag several files at once onto the drop zone. All files are converted using the same source encoding you selected. You can download them individually or use "Download All" to get every converted file in one go.
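The same batch idea can be scripted for a folder of files. This is a rough sketch, not how the converter itself works: `convert_folder` and its parameters are hypothetical, and it assumes every `.csv` in the folder shares one source encoding, just as the converter applies one selected encoding to all files.

```python
from pathlib import Path
from typing import List


def convert_folder(src_dir: Path, dst_dir: Path,
                   source_encoding: str = "cp932") -> List[Path]:
    """Convert every .csv in src_dir to UTF-8 and write it into dst_dir."""
    dst_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for src in sorted(src_dir.glob("*.csv")):
        # Work on raw bytes so line endings pass through unchanged.
        text = src.read_bytes().decode(source_encoding)
        dst = dst_dir / src.name
        dst.write_bytes(text.encode("utf-8"))
        written.append(dst)
    return written
```

A call such as `convert_folder(Path("exports"), Path("exports_utf8"))` would leave the originals untouched and place the converted copies in a separate folder; both folder names here are placeholders.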
- Is my data safe? Does anything get uploaded to a server?
- Nothing is uploaded. The conversion uses the browser's built-in TextDecoder API (supported in all modern browsers) to decode the bytes locally, then TextEncoder to re-encode as UTF-8. The resulting UTF-8 text is turned into a Blob URL for download. Everything happens client-side, with no network request of any kind.
- What should I choose for the line ending setting?
- For most use cases, "Keep original" is fine. Choose CRLF if you plan to open the UTF-8 CSV in Windows Notepad or older Windows apps. Choose LF for Linux/Mac scripts and modern editors. CR (carriage return only) was used by classic Mac OS 9 and some legacy tools — rarely needed today.
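The four settings amount to a single substitution over the decoded text. The sketch below assumes hypothetical mode names (`keep`, `lf`, `crlf`, `cr`) and a hypothetical function `convert_line_endings`; it is not the converter's actual implementation.

```python
import re

NEWLINES = {"lf": "\n", "crlf": "\r\n", "cr": "\r"}


def convert_line_endings(text: str, mode: str = "keep") -> str:
    """Rewrite every line break in `text` to the style named by `mode`."""
    if mode == "keep":
        return text
    # CRLF is listed first so it is matched as one break, not as CR then LF.
    return re.sub(r"\r\n|\r|\n", NEWLINES[mode], text)


mixed = "a,b\r\nc,d\ne,f\r"
print(repr(convert_line_endings(mixed, "lf")))
print(repr(convert_line_endings(mixed, "crlf")))
```

Note that this simple version also rewrites line breaks inside quoted CSV fields, which is usually harmless but worth knowing if a field's exact contents matter.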
- Does the output UTF-8 file have a BOM?
- No. The converter produces UTF-8 without a byte order mark (BOM). The Unicode Standard neither requires nor recommends a BOM for UTF-8, and BOM-less files are what most tools expect. Excel is the main exception: opened by double-click, a BOM-less UTF-8 CSV may be read in the system code page and appear garbled. Rather than adding a BOM so that Excel auto-detects the encoding, open the file through Excel's "From Text/CSV" import wizard and select UTF-8 in the dialog, which is the more reliable approach.
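If a workflow does require the BOM, it can be added after conversion. As a minimal sketch, Python's `utf-8-sig` codec writes the three BOM bytes on encode and strips them on decode; the sample text is made up for illustration.

```python
import codecs

text = "名前,数量\nりんご,3\n"

plain = text.encode("utf-8")         # what the converter produces: no BOM
with_bom = text.encode("utf-8-sig")  # same bytes, prefixed with EF BB BF

print(plain.startswith(codecs.BOM_UTF8))
print(with_bom.startswith(codecs.BOM_UTF8))

# Decoding with utf-8-sig removes the BOM again, and also accepts BOM-less input.
print(with_bom.decode("utf-8-sig") == plain.decode("utf-8-sig"))
```

Reading with `utf-8-sig` is a convenient default for scripts that receive files from mixed sources, because it handles both variants.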