Garbled Chinese Text in Python


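When working with Python 2 we often use the interactive prompt raw_input. If the user types Chinese and the input is not decoded properly, the program may not produce the result you expect.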

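# -*- coding: utf-8 -*-
import locale
import os
import sys

# Glossary: 你好 = "hello"; 我不知道你在说什么 = "I don't know what you are saying".
print u"这里是来自LC'的问候。"  # 这里是来自LC'的问候。

print '=' * 10  # ==========
print u'这将直接执行' + os.getcwd()  # 这将直接执行C:\Python27

# A plain str literal holds the file's UTF-8 bytes, so a GBK console shows it garbled.
print "直接打印Unicode"  # 鐩存帴鎵撳嵃Unicode

# Any of the next three forms prints correctly.
print u"直接打印Unicode"  # 直接打印Unicode
print u"Unicode转换成GB18030".encode('gb18030')  # Unicode转换成GB18030
print "UTF-8中文转换到GB18030, 然后再打印".decode("utf-8").encode('gb18030')  # UTF-8中文转换到GB18030, 然后再打印

# Encoding used to decode whatever the user types at the console.
input_encoding = sys.stdin.encoding or locale.getpreferredencoding(True)

while True:
    # Prompt 1: decode the typed bytes to unicode before comparing.
    message1 = raw_input(u'提问1:'.encode('gb18030')).decode(input_encoding)
    if message1 == u"你好":
        print message1
    else:
        print u"我不知道你在说什么"

    # Prompt 2: the byte str from raw_input is compared with a unicode literal
    # directly, which triggers UnicodeWarning and is treated as unequal.
    message2 = raw_input(u'提问2>'.encode('gb18030'))
    print message2
    if message2 == u"你好":
        print message2
    else:
        print u"我不知道你在说什么"
    # import chardet
    # print chardet.detect(message2)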
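The pattern used for prompt 1 can be pulled out into a small helper. The following is only a sketch: the names console_encoding and to_unicode are hypothetical (they do not appear in the script above), and the sample bytes stand in for what raw_input would return on a GB18030 console. It is written to run on both Python 2.7 and Python 3.

```python
# -*- coding: utf-8 -*-
import locale
import sys


def console_encoding():
    """Best guess at the encoding of bytes typed at the console (hypothetical helper)."""
    # sys.stdin.encoding can be None when input is piped or redirected,
    # so fall back to the locale's preferred encoding.
    return getattr(sys.stdin, 'encoding', None) or locale.getpreferredencoding(True)


def to_unicode(raw, encoding=None):
    """Decode byte strings to text; text passes through unchanged (hypothetical helper)."""
    if isinstance(raw, bytes):
        return raw.decode(encoding or console_encoding())
    return raw


# Stand-in for the bytes raw_input would return for the input 你好
# on a GB18030 console (an assumption made for this example).
typed = u'你好'.encode('gb18030')
print(to_unicode(typed, 'gb18030') == u'你好')  # True
```

Passing the encoding explicitly keeps the example deterministic; in a real script you would normally leave it out and let console_encoding() decide.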

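How do you convert the characters returned by raw_input to UTF-8?

Python provides two methods for this, decode and encode. First decode the str into Unicode, then encode the Unicode into the byte string you need.

decode usage: str -> decode('the_coding_of_str') -> unicode

encode usage: unicode -> encode('the_coding_you_want') -> str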
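As a concrete illustration of the two arrows above, here is a minimal sketch that converts GB18030 bytes to UTF-8 bytes by going through Unicode. The function name transcode is hypothetical, and the sample text is chosen only for the example. The code runs on both Python 2.7 and Python 3 (where the byte string type is called bytes and the Unicode type is str).

```python
# -*- coding: utf-8 -*-


def transcode(data, source, target):
    """Re-encode a byte string from one encoding to another (hypothetical helper)."""
    text = data.decode(source)   # bytes -> decode(source) -> unicode
    return text.encode(target)   # unicode -> encode(target) -> bytes


text = u'你好'
gb_bytes = text.encode('gb18030')
utf8_bytes = text.encode('utf-8')

# The same two characters occupy a different number of bytes in each encoding.
print('%d characters, %d GB18030 bytes, %d UTF-8 bytes'
      % (len(text), len(gb_bytes), len(utf8_bytes)))
# 2 characters, 4 GB18030 bytes, 6 UTF-8 bytes

print(transcode(gb_bytes, 'gb18030', 'utf-8') == utf8_bytes)  # True
```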

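A str is the sequence of bytes produced by encoding Unicode text. To decode it you must know the encoding of the input. If you name the wrong one, Python either raises UnicodeDecodeError or, when the bytes happen to be valid in that encoding, silently returns the wrong characters.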
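Both failure modes of decoding with the wrong encoding can be reproduced in a few lines. This is a sketch with sample text chosen for the example, and it runs on both Python 2.7 and Python 3. The garbled string in the second case is the same one that shows up in the UnicodeWarning traceback of the console log.

```python
# -*- coding: utf-8 -*-
text = u'你好'
gb_bytes = text.encode('gb18030')
utf8_bytes = text.encode('utf-8')

# Case 1: GB18030 bytes are not valid UTF-8, so decoding raises an error.
try:
    gb_bytes.decode('utf-8')
    raised = False
except UnicodeDecodeError:
    raised = True
print(raised)  # True

# Case 2: these UTF-8 bytes happen to be valid GB18030, so decoding
# succeeds but yields the wrong characters (mojibake).
mojibake = utf8_bytes.decode('gb18030')
print(mojibake == text)       # False
print(mojibake == u'浣犲ソ')  # True

# No information was lost here: reversing the wrong step recovers the text.
print(mojibake.encode('gb18030').decode('utf-8') == text)  # True
```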

C:\Python27>python bianmaceshi.py
这里是来自LC'的问候。
==========
这将直接执行C:\Python27
鐩存帴鎵撳嵃Unicode  -- printed directly, the string comes out garbled like this; the three forms that follow print correctly
直接打印Unicode
Unicode转换成GB18030
UTF-8中文转换到GB18030, 然后再打印
提问1:你好
你好
提问2>你好
你好
bianmaceshi.py:26: UnicodeWarning: Unicode equal comparison failed to convert both arguments to Unicode - interpreting them as being unequal
  if message2 == u"浣犲ソ":
我不知道你在说什么  -- this is where it goes wrong: the expected output is 你好; switching to the form used for prompt 1 fixes it
提问1:呃
呃
我不知道你在说什么
提问2>呃
呃
我不知道你在说什么
提问1:
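The difference between the two prompts in the log above comes down to what is being compared. The sketch below uses a fixed byte string as a stand-in for the value raw_input returns on a GB18030 console (an assumption made for this example). On Python 2.7 the direct comparison emits the UnicodeWarning seen in the log and evaluates to False; on Python 3 comparing bytes with str is simply False.

```python
# -*- coding: utf-8 -*-
expected = u'你好'

# Stand-in for the byte string raw_input returns for the input 你好.
raw = expected.encode('gb18030')

# Prompt 2: bytes compared with unicode directly, so they never match.
direct_match = (raw == expected)

# Prompt 1: decode to unicode first, then compare.
decoded_match = (raw.decode('gb18030') == expected)

print(direct_match)   # False
print(decoded_match)  # True
```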

Friday, May 13, 2016 | Python

LC's Blog
