
HTML theme parser - First version released

APPSEED

The first version is up & running - Yupiii!

Hello, audience, 

Adi here, from the APPSEED team. I have some good news and some bad news. Because I'm a positive guy, I will start with the good news: we have a working HTML parser written in Python with bs4 (BeautifulSoup). As of today (Nov 1, 2018), I have managed to parse & process more than 10 HTML themes, and the result is:

- the average integration time for an external HTML theme is around 20 minutes. Integration means: isolating the dependencies (JS files, CSS, images), generating the master page from index.html, extracting all sections, and extracting the hard-coded texts.
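To make the "isolate the dependencies" step more concrete, here is a minimal sketch of how the JS, CSS and image paths of a page could be collected. It is not the code of our tool: it uses only Python's standard `html.parser` module (instead of bs4) so that it runs without installing anything, and the names `AssetCollector`, `collect_assets` and the sample paths are made up for illustration.

```python
from html.parser import HTMLParser


class AssetCollector(HTMLParser):
    """Collects the JS, CSS and image paths referenced by an HTML page."""

    def __init__(self):
        super().__init__()
        self.assets = {"js": [], "css": [], "images": []}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "script" and attrs.get("src"):
            self.assets["js"].append(attrs["src"])
        elif tag == "link":
            # Only keep <link> tags that point to a stylesheet.
            rel = (attrs.get("rel") or "").lower().split()
            if "stylesheet" in rel and attrs.get("href"):
                self.assets["css"].append(attrs["href"])
        elif tag == "img" and attrs.get("src"):
            self.assets["images"].append(attrs["src"])


def collect_assets(html):
    """Return a dict with the 'js', 'css' and 'images' paths found in html."""
    collector = AssetCollector()
    collector.feed(html)
    collector.close()
    return collector.assets


# Hypothetical input, standing in for the index.html of a theme.
SAMPLE = """
<html><head>
<link rel="stylesheet" href="css/theme.css">
<script src="js/app.js"></script>
</head><body><img src="img/logo.png" alt="logo"></body></html>
"""

print(collect_assets(SAMPLE))
```

Once the paths are known, copying the files to a standard location and rewriting the references is a separate step that this sketch does not cover.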

The result is quite #cool because the HTML themes were bought from ThemeForest, and each one comes with its own non-standard structure.

Now, a few words about our tool's architecture:

- it's an interactive console 

- we can mutate the HTML tree structure

- we can extract sections, delete nodes, edit nodes, edit resource paths

- markers are injected into the HTML files so that the HTML can be translated to any render engine we want: CodeIgniter native, Blade (Laravel), Jinja2 (Flask). We are now working on generating templates for React & Vue.
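The marker idea from the last point can be sketched in a few lines of Python. This is only an illustration under assumptions: the marker syntax `<!-- @section:NAME -->` is invented here (the real markers of the tool are not shown in this post), and `render_markers` is a hypothetical helper. The include statements themselves are the standard ones of Jinja2, Blade and CodeIgniter.

```python
import re

# Hypothetical neutral marker, e.g. <!-- @section:hero -->
MARKER = re.compile(r"<!--\s*@section:([\w-]+)\s*-->")

# Include statement of each render engine; {name} is the section name.
INCLUDE_SYNTAX = {
    "jinja2": "{{% include '{name}.html' %}}",
    "blade": "@include('{name}')",
    "codeigniter": "<?php $this->load->view('{name}'); ?>",
}


def render_markers(html, engine):
    """Replace every neutral marker with the include syntax of one engine."""
    template = INCLUDE_SYNTAX[engine]
    # A function is used as replacement so that no character of the
    # generated text is interpreted as a regex escape.
    return MARKER.sub(lambda m: template.format(name=m.group(1)), html)


# Hypothetical master page with one extracted section.
master = "<body>\n<!-- @section:hero -->\n</body>"

for engine in INCLUDE_SYNTAX:
    print(engine, "->", render_markers(master, engine))
```

The point of the design is that the parser output stays engine-neutral: the same marked-up master page can be translated to a new engine by adding one entry to the table.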


Now the bad news ... :)

It's not easy to write the code. The complexity comes from the fact that HTML files are not homogeneous. For instance, the sections can be listed directly under the body node or two levels below it. The parser should handle this automatically, and trust me, this can be tricky to patch.
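Here is a minimal sketch of one possible heuristic for the "body or two levels below" case: starting from body, skip wrapper nodes as long as a node has exactly one content child, up to two levels. This is an assumption made for illustration, not the rule our parser applies, and it will guess wrong on some themes (for example a page with a single section). It uses the standard `html.parser` module instead of bs4 so it runs as-is; `Node`, `TreeBuilder`, `parse` and `find_sections` are hypothetical names.

```python
from html.parser import HTMLParser

VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}
IGNORED_TAGS = {"script", "noscript", "style"}  # not page content


class Node:
    def __init__(self, tag, attrs, parent=None):
        self.tag, self.attrs, self.parent = tag, dict(attrs), parent
        self.children = []


class TreeBuilder(HTMLParser):
    """Builds a very small element tree (tags and attributes only)."""

    def __init__(self):
        super().__init__()
        self.root = Node("document", [])
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        node = Node(tag, attrs, self.current)
        self.current.children.append(node)
        if tag not in VOID_TAGS:
            self.current = node

    def handle_endtag(self, tag):
        # Close the nearest open element with this tag; ignore stray end tags.
        node = self.current
        while node is not None and node.tag != tag:
            node = node.parent
        if node is not None and node.parent is not None:
            self.current = node.parent


def parse(html):
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    return builder.root


def find_body(node):
    if node.tag == "body":
        return node
    for child in node.children:
        found = find_body(child)
        if found is not None:
            return found
    return None


def content_children(node):
    return [c for c in node.children if c.tag not in IGNORED_TAGS]


def find_sections(root, max_depth=2):
    """Return the section nodes, skipping up to max_depth wrapper levels."""
    container = find_body(root)
    if container is None:
        raise ValueError("no <body> node found")
    depth = 0
    while depth < max_depth and len(content_children(container)) == 1:
        container = content_children(container)[0]
        depth += 1
    return content_children(container)


# Two hypothetical themes: sections directly under body, and two levels below.
FLAT = '<body><section id="hero"></section><section id="features"></section></body>'
WRAPPED = ('<body><div class="wrapper"><div class="page">'
           '<section id="hero"></section><section id="features"></section>'
           '</div></div><script src="js/app.js"></script></body>')

for sample in (FLAT, WRAPPED):
    print([n.attrs.get("id") for n in find_sections(parse(sample))])
```

Both samples give the same list of sections, which is the behavior the parser needs. The hard part in practice is all the themes that do not fit such a simple rule.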


Future actions:

- integrate a machine learning module to learn & resolve HTML code exceptions faster

- parse many HTML themes to train the ML module

- expose the tool to developers, to help them deliver a new web app project faster.


Cheers!

 Adi - AppSeed.us 

#automation tools & coded apps in Php, Python and Javascript   

  


#automation      #tools      #html-processing