Split file example, Regex

  1. Case Requirements

    The file to be split is an index.html file in which HTML, CSS, and JS code are mixed together. The goal is to split it into three files, index.html, index.css, and index.js, and to place all three in the same clock folder.


  2. Case steps

    1. Import the required fs and path modules and create the regular expressions. The regular expressions are used to extract the <style> and <script> blocks from the mixed file.

      const fs = require('fs')
      const path = require('path')

      const regStyle = /<style>[\s\S]*<\/style>/
      const regScript = /<script>[\s\S]*<\/script>/

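      Note that [\s\S] matches any character. \s matches any whitespace character, including newlines, and \S matches any non-whitespace character, so together they cover everything, line breaks included (which . does not match by default). In <\/style>, the backslash escapes the forward slash so that it is not read as the end of the regular expression literal.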
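The difference between [\s\S] and . can be checked with a small experiment. This is only a sketch: sampleHtml and regDot are hypothetical names introduced for the comparison, not part of the case.

```javascript
// A hypothetical mixed page used only for this illustration.
const sampleHtml = [
  '<style>',
  'body { margin: 0; }',
  '</style>',
  '<p>hello</p>',
  '<script>',
  'console.log(1)',
  '</script>',
].join('\n')

const regStyle = /<style>[\s\S]*<\/style>/
const regDot = /<style>.*<\/style>/ // "." does not match line breaks without the s flag

const styleMatch = regStyle.exec(sampleHtml)
const dotMatch = regDot.exec(sampleHtml)

console.log(styleMatch[0]) // the whole <style>...</style> block, newlines included
console.log(dotMatch)      // null, because the block spans several lines
```

Keep in mind that * is greedy: if a page contained several <style> blocks, the match would run from the first opening tag to the last closing tag. Writing [\s\S]*? instead would stop at the first closing tag.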


    2. Use the fs module to read the html mixed file that needs to be processed.


    3. Customize the method of splitting css code

      function resolveCSS(htmlStr) {
        // Extract the <style> tag and the code inside it from the mixed page string
        const r1 = regStyle.exec(htmlStr)
        if (!r1) return console.log('no <style> block found')
        // Remove the opening and closing tags
        const newCSS = r1[0].replace('<style>', '').replace('</style>', '')

        fs.writeFile(path.join(__dirname, './clock/index.css'), newCSS, err => {
          if (err) return console.log(err.message)
          console.log('CSS written successfully!')
        })
      }

      The exec() method searches a string for a match of the regular expression. It takes the string as its parameter and returns an array holding the match results, or null if no match is found. Element 0 of the array is the text matched by the whole regular expression. Element 1 is the text matched by the first capture group (a parenthesized subexpression), if there is one, element 2 is the text matched by the second capture group, and so on. regStyle contains no capture groups, so only element 0 is used here. The shape of the result is the same whether the pattern is written as a literal or built with the constructor, as in const regex1 = RegExp('foo*', 'g').
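To make the array returned by exec() concrete, here is a small sketch with one capture group. regStyleGroup and the input strings are hypothetical and are not used elsewhere in the case.

```javascript
// Hypothetical pattern: the parentheses form one capture group around the tag contents.
const regStyleGroup = /<style>([\s\S]*)<\/style>/

const result = regStyleGroup.exec('<style>p { color: red; }</style>')
const noMatch = regStyleGroup.exec('<p>no styles here</p>')

console.log(result[0]) // '<style>p { color: red; }</style>' (the full match)
console.log(result[1]) // 'p { color: red; }' (capture group 1, tags excluded)
console.log(noMatch)   // null
```

With a capture group, the inner code is available directly as element 1, which is an alternative to stripping the tags with replace().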


    4. Customize the method of splitting JS code (same as above)
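Since the original only says "same as above", the following sketch shows what the JS version could look like when it mirrors resolveCSS. The names resolveJS and extractJS and the log messages are assumptions; extractJS is a hypothetical helper split out so the extraction can be checked without touching the disk.

```javascript
const fs = require('fs')
const path = require('path')

const regScript = /<script>[\s\S]*<\/script>/

// Hypothetical helper: returns the code between the tags, or null if there is no <script> block.
function extractJS(htmlStr) {
  const r2 = regScript.exec(htmlStr)
  if (!r2) return null
  return r2[0].replace('<script>', '').replace('</script>', '')
}

// Mirrors resolveCSS: extract the code, then write it to ./clock/index.js.
function resolveJS(htmlStr) {
  const newJS = extractJS(htmlStr)
  if (newJS === null) return console.log('no <script> block found')

  fs.writeFile(path.join(__dirname, './clock/index.js'), newJS, err => {
    if (err) return console.log(err.message)
    console.log('JS written successfully!')
  })
}
```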


    5. Customize the method of splitting html code

      function resolveHTML(htmlStr) {
        // Replace the inline blocks with references to the external files
        const newHTML = htmlStr
          .replace(regStyle, '<link rel="stylesheet" href="./index.css"/>')
          .replace(regScript, '<script src="./index.js"></script>')

        fs.writeFile(path.join(__dirname, './clock/index.html'), newHTML, err => {
          if (err) return console.log(err.message)
          console.log('HTML written successfully!')
        })
      }
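The effect of the two replace() calls can be seen on a short string. This is an illustrative sketch: mixed is a hypothetical one-line page, and the external script is referenced with a script element carrying a src attribute, which is how HTML loads external JS.

```javascript
const regStyle = /<style>[\s\S]*<\/style>/
const regScript = /<script>[\s\S]*<\/script>/

// Hypothetical one-line page, used only to show the effect of the two replace() calls.
const mixed = '<style>p{color:red}</style><p>tick</p><script>console.log(1)</script>'

const newHTML = mixed
  .replace(regStyle, '<link rel="stylesheet" href="./index.css"/>')
  .replace(regScript, '<script src="./index.js"></script>')

console.log(newHTML)
// <link rel="stylesheet" href="./index.css"/><p>tick</p><script src="./index.js"></script>
```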


  3. Precautions

    1. fs.writeFile() can create a file, but it cannot create missing directories in the path.

      In other words, the clock folder must be created manually before running the case. If it does not exist, writing the file fails.

    2. When fs.writeFile() is called repeatedly on the same file, the newly written content overwrites the old content.
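Both precautions can be observed in a short experiment. The sketch below is not part of the case: it uses the synchronous fs variants so the order of operations is obvious, and it works in a throwaway directory under the OS temp folder (baseDir, clockDir, and cssFile are hypothetical names). It also shows that the folder can be created from code with mkdirSync instead of by hand.

```javascript
const fs = require('fs')
const os = require('os')
const path = require('path')

// Assumption: a throwaway directory under the OS temp folder stands in for the project folder.
const baseDir = fs.mkdtempSync(path.join(os.tmpdir(), 'split-demo-'))
const clockDir = path.join(baseDir, 'clock')
const cssFile = path.join(clockDir, 'index.css')

// Precaution 1: writing before the folder exists fails with ENOENT.
let firstError = null
try {
  fs.writeFileSync(cssFile, 'a{}')
} catch (err) {
  firstError = err.code
}

// recursive: true creates missing parent folders and does not fail if the folder already exists.
fs.mkdirSync(clockDir, { recursive: true })

// Precaution 2: a second write replaces the previous content.
fs.writeFileSync(cssFile, 'a{}')
fs.writeFileSync(cssFile, 'b{}')
const afterOverwrite = fs.readFileSync(cssFile, 'utf8')

// appendFileSync adds to the end of the file instead of replacing it.
fs.appendFileSync(cssFile, 'c{}')
const afterAppend = fs.readFileSync(cssFile, 'utf8')

console.log(firstError, afterOverwrite, afterAppend)
```

If content should accumulate rather than be replaced, fs.appendFile() is the asynchronous counterpart of the append call used here.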

