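How to read UTF-8 encoding?
Hi all, I am trying to write code that reads data from a .txt file and fills it into an Object Data table in AutoCAD Map, but the text is displayed as ????????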
https://www.cadtutor.net/forum/attachment.php?attachmentid=63717&d=1523370322&thumb=1&stc=1
Could you help me?
Attached files: test.txt, a .dwg drawing, and a .lsp routine that fills the OD table.
Not sure, but give this a try. Sorry if it is not a good solution.
Can't display "ດິນບຸກຄົນ" ?
;;;(setq remark (vk_ReadTextStream "C:/test.txt" "UTF-8"))
;Alternatively, copy the text from the text file manually and paste it at the prompt
(setq remark (getstring "\nPaste your text here -> "))
;"\U+0E94\U+0EB4\U+0E99\U+0E9A\U+0EB8\U+0E81\U+0E84\U+0EBB\U+0E99"
;or dialog
(setq remark (lisped "paste here ") )
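Thanks hanhphuc, I will try it.
Some Asian fonts can be displayed. The normal open function works, but it only supports ANSI.
A Unicode (UTF-16) file starts with a byte-order mark: the byte pair FE FF (hex), i.e. 254 255, for big-endian, or FF FE (255 254) for little-endian.
Save test.txt as Unicode.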
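The byte-order-mark check described above can be sketched outside AutoCAD. This is a minimal Python illustration, not part of the original routine; the function name `detect_bom` and the returned labels are assumptions made for the example, and UTF-32 marks are deliberately ignored.

```python
import codecs


def detect_bom(data: bytes) -> str:
    """Guess the encoding from the first bytes of a file.

    Returns a label (hypothetical, chosen for this sketch):
    'utf-8-sig', 'utf-16-le', 'utf-16-be', or 'ansi' when no mark is found.
    """
    if data.startswith(codecs.BOM_UTF8):       # EF BB BF
        return "utf-8-sig"
    if data.startswith(codecs.BOM_UTF16_LE):   # FF FE (255 254)
        return "utf-16-le"
    if data.startswith(codecs.BOM_UTF16_BE):   # FE FF (254 255)
        return "utf-16-be"
    return "ansi"                              # no mark: treat as a plain ANSI file


if __name__ == "__main__":
    print(detect_bom(b"\xff\xfeA\x00"))
```

Checking only the first two or three bytes avoids stripping 254 or 255 values that occur later in the file as part of real characters.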
(setq f (open path "r"))
(setq ret (read-line f)) ;<--test only 1st line
(if f (close f))
Test:
(defun foo ( str ) ; read unicode - test version
;hanhphuc 17.04.2018
(apply 'strcat
(mapcar
''( ( x ) (apply 'strcat (vl-list* (chr 92) "U+" (mapcar ''( (x / $)(setq $ ( LM:dec->base x 16))
(if (or (< x 10) (=(strlen $)1)) (strcat "0" $) $) )
(reverse x)
)
)
)
)
(
'( ( f ) (f (vl-remove-if
'(lambda (x) (vl-some '(lambda (y)
(= x y)
)
'( 254 255 ))
)
(vl-string->list str)
)
)
)
'( ( l ) (if l (cons (list (car l)(cadr l))
(f (cddr l)))
)
)
)
)
)
)
;; Decimal to Base - Lee Mac
;; Converts a decimal number to another base.
;; n - decimal integer
;; b - non-zero positive integer base
;; Returns: Representation of decimal in specified base
(defun LM:dec->base ( n b )
(if (< n b)
(chr (+ n (if (< n 10) 48 55)))
(strcat (LM:dec->base (/ n b) b) (LM:dec->base (rem n b) b))
)
)
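For readers who want to check the output of foo without AutoCAD, here is a rough Python sketch of the same idea: drop the byte-order mark, take the bytes two at a time, swap each pair (the file is assumed to be little-endian, which matches the `(reverse x)` step above), and print each pair as a `\U+XXXX` escape. The function name is hypothetical, and characters outside the Basic Multilingual Plane (surrogate pairs) are not handled.

```python
def utf16le_to_escapes(data: bytes) -> str:
    """Convert UTF-16 little-endian bytes to a string of \\U+XXXX escapes."""
    # Strip the byte-order mark only at the very start of the data.
    if data[:2] == b"\xff\xfe":
        data = data[2:]
    parts = []
    # Walk the bytes in pairs; a trailing odd byte is ignored.
    for i in range(0, len(data) - 1, 2):
        low, high = data[i], data[i + 1]
        # Little-endian stores the low byte first, so print high then low.
        parts.append("\\U+%02X%02X" % (high, low))
    return "".join(parts)


if __name__ == "__main__":
    sample = bytes([0xFF, 0xFE, 0x94, 0x0E, 0xB4, 0x0E, 0x99, 0x0E])
    print(utf16le_to_escapes(sample))  # \U+0E94\U+0EB4\U+0E99
```

Unlike the AutoLISP version, this sketch removes 255 254 only when it is the leading pair, so a character whose bytes happen to contain 254 or 255 is left intact.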
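Please try it and see whether the above works for your language.
Otherwise, plan B: assuming you use a UTF-8 file, read it as a stream with FSO. That is more stable, but pairing up the 1 to 4 bytes of each character is tricky. I will look at it if I have time.
Awesome!
It works.
Thank you very much, hanhphuc.
Thanks to Lee for the code.
You're welcome. I hope you can write the code yourself next time.
If you run into problems with the previous Unicode method, my UTF-8 functions below may be useful in the future. Give them a try, and good luck.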
(alert (foo ret ) )
ດິນບຸກຄົນ ??
"\U+0E94\U+0EB4\U+0E99\U+0E9A\U+0EB8\U+0E81\U+0E84\U+0EBB\U+0E99"
Finally, concatenate all the encoded byte lists:
Some screenshots:
https://i.imgur.com/ooL2kHm.png
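Randomly tested with Arabic, Chinese, Hindi, Japanese, Korean, Lao, Punjabi, Russian, Tamil, Vietnamese, etc. There are still some issues.
Hello,
Could you help me?
What is wrong with this code?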
;Reference, post#138
;https://stackoverflow.com/questions/643694/what-is-the-difference-between-utf-8-and-unicode
(defun UTF8->unicode ( l / ls 8b d2 foo) ; encode UTF-8 to unicode
;;;hanhphuc 17.04.2018
(setq 8b '((s) (while (< (strlen s) 8) (setq s (strcat "0" s))) s)
d2 '((str) ;split string to two list
(if (> (strlen str) 0)
(cons (substr str 1 8) (d2 (setq str (substr str 9))))
)
)
foo '(($ / pos i) ; base2 to decimal
(setq i 0)
(+ (cond ((while (and (> (strlen $) 0) (setq pos (vl-string-search "1" $)))
(setq $ (substr $ (+ 2 pos))
i (+ i (expt 2 (strlen $)))
)
)
)
(0)
)
(atoi $)
)
)
ls (mapcar ''((x / $)
(setq $ (LM:dec->base (foo x) 16))
(if
(= (strlen $) 1)
(strcat "0" $)
$
)
)
(d2
(apply 'strcat
(mapcar ''((a x) (substr (8b a) (- 9 x) x))
l
(cdr (assoc (length l) '((1 . (7)) (2 . (5 6)) (3 . (4 6 6)) (4 . (3 6 6 6)))))
)
)
)
)
)
(apply 'strcat
(vl-list* "\\U"
(if (> (length ls) 1)
"+"
"+00")
ls
)
)
)
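The bit layout that UTF8->unicode relies on (7 payload bits for a 1-byte sequence, 5+6 for 2 bytes, 4+6+6 for 3 bytes, 3+6+6+6 for 4 bytes) may be easier to follow with integer arithmetic instead of binary strings. The Python sketch below is an illustration under that reading, not a port of the function above; the names `PAYLOAD_BITS` and `utf8_sequence_to_escape` are made up for the example.

```python
# Payload bits carried by each byte, keyed by the sequence length.
PAYLOAD_BITS = {1: (7,), 2: (5, 6), 3: (4, 6, 6), 4: (3, 6, 6, 6)}


def utf8_sequence_to_escape(seq) -> str:
    """Decode one UTF-8 byte sequence (a list of ints) to a \\U+XXXX escape."""
    code_point = 0
    for byte, bits in zip(seq, PAYLOAD_BITS[len(seq)]):
        # Keep only the low `bits` bits of this byte and append them.
        code_point = (code_point << bits) | (byte & ((1 << bits) - 1))
    # At least four hex digits; 4-byte sequences give more than four.
    return "\\U+%04X" % code_point


if __name__ == "__main__":
    print(utf8_sequence_to_escape([0xE0, 0xBA, 0x94]))  # \U+0E94
```

Accumulating the code point as a number and formatting it once at the end sidesteps the question of how to cut an 11-bit or 21-bit string into bytes. Whether AutoCAD accepts escapes longer than four hex digits is not covered by this thread, so treat the 4-byte case as untested.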
(defun U8:bytes (l / x ls)
;hanhphuc 17.04.2018
;UTF-8 split the bytes
(setq x (car l))
(if l
(cons (vl-remove nil (cond ((<= 0 x 191)
(setq ls (list x)
l(cdr l)
)
ls
)
((<= 192 x 223)
(setq ls (list x (cadr l))
l(cddr l)
)
ls
)
((<= 224 x 239)
(setq ls (list x (cadr l) (caddr l))
l(cdddr l)
)
ls
)
((<= 240 x 247)
(setq ls (list x (cadr l) (caddr l) (cadddr l))
l(cddddr l)
)
ls
)
)
)
(U8:bytes l)
)
)
)
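The grouping rule in U8:bytes (the lead byte decides how many bytes belong to one character) can be sketched in Python as follows. This is an illustration only: `split_utf8` is a hypothetical name, continuation bytes are not validated, and lead values of 248 and above, which the AutoLISP version does not handle, are passed through as single bytes here.

```python
def split_utf8(data):
    """Group UTF-8 bytes into per-character lists, based on the lead byte."""
    groups = []
    i = 0
    while i < len(data):
        lead = data[i]
        if lead <= 191:      # 0..191: ASCII, or a stray continuation byte
            n = 1
        elif lead <= 223:    # 192..223: start of a 2-byte sequence
            n = 2
        elif lead <= 239:    # 224..239: start of a 3-byte sequence
            n = 3
        elif lead <= 247:    # 240..247: start of a 4-byte sequence
            n = 4
        else:                # 248..255: not valid UTF-8, keep as one byte
            n = 1
        groups.append(list(data[i:i + n]))
        i += n
    return groups


if __name__ == "__main__":
    raw = bytes([0x61, 0xE0, 0xBA, 0x94, 0xC3, 0xA9])
    print(split_utf8(raw))  # [[97], [224, 186, 148], [195, 169]]
```

A loop with an index always advances by at least one byte, so malformed input cannot stall it.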
So replace it with this:
(setq ret "Lee Mac & Marko Ribar\r\nHappy Birthday\r\nç¥ä½*们生日快ä¹\r\n幸ç¦\r\nChúc mừng sinh nháº*t\r\n"
)
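The odd-looking characters in that string are what UTF-8 bytes look like when each byte is shown as a separate ANSI character. The Python sketch below reproduces the effect and reverses it. It assumes Latin-1 as a stand-in for the ANSI code page (the actual Windows code page may differ for a few byte values), and both function names are hypothetical.

```python
def as_ansi_view(text: str) -> str:
    """Show how UTF-8 text looks when every byte is read as one character."""
    return text.encode("utf-8").decode("latin-1")


def recover(garbled: str) -> str:
    """Turn the one-character-per-byte view back into the original text."""
    return garbled.encode("latin-1").decode("utf-8")


if __name__ == "__main__":
    garbled = as_ansi_view("Chúc mừng")
    print(len("Chúc mừng"), len(garbled))  # the garbled form is longer
    print(recover(garbled) == "Chúc mừng")
```

As long as no byte is dropped or altered on the way, the raw string still holds everything needed to rebuild the text.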
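Happy coding.
P.S. Use read-char to read a Unicode file.
Thank you very much, hanhphuc.
You're welcome.
Sorry, there is a typo in the reference comment: #138 should be post #147.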