VerySource


Chinese text stored as UTF-8 comes out garbled

Post time: 2020-2-8 23:00:01
DOM4J writes garbled text to the file, and re-encoding the string does not help. Here is my test code:
import java.io.*;
import org.dom4j.*;
import org.dom4j.io.*;

public class Test {
    public static void main(String[] args) {
        try {
            SAXReader reader = new SAXReader();
            Document document = reader.read("c:/Demo.txt");
            Element root = document.getRootElement();
            String a = "R & D department"; // a Chinese string in the original post (translated here)
            Element newElement = root.addElement("Department")
                    .addAttribute("value", new String(a.getBytes(), "UTF-8"));
            // OutputFormat format = new OutputFormat("", true, "GBK");
            // Using this format solves the problem. However, the XML is specified as UTF-8.
            XMLWriter writer = new XMLWriter(
                    new FileOutputStream(new File("c:/Demo.txt")));
                    // new FileWriter("c:/Demo.txt"));
            // Using FileWriter is not correct: DOM4J does not transcode, and the second write reports an error.
            writer.write(document);
            writer.close();
        } catch (Exception e) {
            System.out.println(e.getMessage());
        }
    }
}
Demo.txt:
<?xml version="1.0" encoding="UTF-8"?>
<Company>

</Company>

Post time: 2020-4-10 08:00:01
You don't need new String(a.getBytes("GBK"), "UTF-8"); just pass a directly. The default output encoding is UTF-8.

Post time: 2020-4-10 19:00:01
Try changing it to GB2312.

Post time: 2020-4-11 01:15:01
GBK

Author| Post time: 2020-4-15 10:00:01
Everyone, please read the question carefully ...
The requirement is that the output must be UTF-8.

Post time: 2020-4-16 19:45:01
public static void main(String[] args) {
    try {
        SAXReader reader = new SAXReader();
        Document document = reader.read("c:/Demo.xml");
        Element root = document.getRootElement();
        String a = "R & D Department"; // a Chinese string in the original post (translated here)
        System.out.println(root.getText() + "1");
        Element newElement = root.addElement("Department")
                .addAttribute("value", new String(a.getBytes("UTF-8"), "UTF-8"));
        // Declared but never passed to the writer, so the output stays in the default UTF-8.
        OutputFormat format = new OutputFormat("", true, "GBK");
        // Using format can solve the problem, but the XML is specified as UTF-8.
        XMLWriter writer = new XMLWriter(
                new FileOutputStream(new File("c:/Demo.xml")));
                // new FileWriter("c:/Demo.txt"));
        // Using FileWriter is not correct: DOM4J does not transcode, and the second write reports an error.
        writer.write(document);
        writer.close();
    } catch (Exception e) {
        System.out.println(e.getMessage());
    }
}

Author| Post time: 2020-4-17 12:15:01
Thanks to the poster above.
Element newElement = root.addElement("Department")
        .addAttribute("value", new String(a.getBytes("UTF-8"), "UTF-8"));
So the key is that the bytes are also produced as UTF-8 before decoding, i.e. a.getBytes("UTF-8")?

Post time: 2020-4-19 11:15:01
The code above is effectively the same as doing no conversion at all.

I just wanted to show you that new String(a.getBytes(), "UTF-8") is what caused the garbled characters.

a.getBytes() encodes the Chinese characters with the platform default charset (GBK on a typical Chinese-locale Windows system, for example), and you then decode those bytes as if they were UTF-8. That cannot work.

In fact, <?xml version="1.0" encoding="UTF-8"?> does not set the encoding of the XML file itself; it tells other software which encoding to use when interpreting the file.

So you don't need to worry too much about it when working with XML files locally.
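
To make the point about new String(a.getBytes(), "UTF-8") concrete, here is a minimal sketch that uses only the JDK. The class name CharsetMismatchDemo, the helper reDecode, and the sample string are assumptions for illustration (the string stands in for the Chinese text in the original post), and GBK is only used if this JDK ships it.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetMismatchDemo {
    // Encodes text with one charset and decodes the bytes with another,
    // mirroring the pattern new String(a.getBytes(X), Y) from the thread.
    static String reDecode(String text, Charset encodeWith, Charset decodeWith) {
        return new String(text.getBytes(encodeWith), decodeWith);
    }

    public static void main(String[] args) {
        // Hypothetical sample: three Chinese characters written as Unicode escapes,
        // so the source file's own encoding cannot interfere.
        String a = "\u7814\u53d1\u90e8";

        // Same charset in both directions: the string comes back unchanged,
        // which is why getBytes("UTF-8") + "UTF-8" is a no-op.
        System.out.println(a.equals(reDecode(a, StandardCharsets.UTF_8, StandardCharsets.UTF_8)));

        // Different charsets: the bytes are misread and the text is garbled.
        // GBK lives in the JDK's extended charsets, so check before using it.
        if (Charset.isSupported("GBK")) {
            String garbled = reDecode(a, Charset.forName("GBK"), StandardCharsets.UTF_8);
            System.out.println(a.equals(garbled));
        }
    }
}
```

A Java String is already Unicode in memory, so the value can be handed to addAttribute as it is; the charset only matters at the point where characters are turned into bytes for the file.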

Author| Post time: 2020-4-20 12:30:02
Thank you

Post time: 2020-5-7 12:55:49
<?xml version="1.0" encoding="UTF-8"?> Just this declaration should be enough.
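
One caveat worth sketching: the declaration only helps if the bytes in the file really are UTF-8. The example below is a rough illustration that swaps in the JDK's built-in DOM parser instead of DOM4J so that it runs without extra libraries. The class name DeclarationDemo, the helpers buildXml and readValue, and the sample value are all assumptions, not code from the thread.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class DeclarationDemo {
    // Builds the XML text. The declaration only states which encoding a reader
    // should assume; it does not encode anything by itself.
    // The value is inserted unescaped, so it must not contain XML special characters.
    static String buildXml(String value) {
        return "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
                + "<Company><Department value=\"" + value + "\"/></Company>";
    }

    // Parses raw bytes with the JDK's DOM parser and returns the attribute value.
    static String readValue(byte[] xmlBytes) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xmlBytes));
            Element dept = (Element) doc.getElementsByTagName("Department").item(0);
            return dept.getAttribute("value");
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML: " + e.getMessage(), e);
        }
    }

    public static void main(String[] args) {
        String value = "\u7814\u53d1\u90e8"; // hypothetical sample text
        String xml = buildXml(value);

        // Bytes really are UTF-8, matching the declaration: the value survives.
        System.out.println(value.equals(readValue(xml.getBytes(StandardCharsets.UTF_8))));

        // Bytes written in another charset while the declaration still says UTF-8:
        // the reader either fails or returns different text.
        if (Charset.isSupported("GBK")) {
            try {
                String got = readValue(xml.getBytes(Charset.forName("GBK")));
                System.out.println(value.equals(got));
            } catch (IllegalStateException e) {
                System.out.println("parse failed: " + e.getMessage());
            }
        }
    }
}
```

This would also be consistent with the FileWriter remark in the first post: FileWriter writes in the platform default charset, so on a GBK system the file no longer matches its UTF-8 declaration and the next read can fail.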