build scraper for blog.ejs #72

Closed
wants to merge 4 commits into from

heytulsiprasad
Contributor

heytulsiprasad commented Dec 16, 2019

Description

This PR is a work in progress (WIP) aimed at solving #68. The scraper.js file in the router directory contains the code that fetches data from zairza.blog.in. It collects the title, author, release date, and cover image link of every blog post and stores each post as an object.

Dependencies Added

  • request
  • cheerio

Problems still facing

While writing the derived objects into blogs.json, unexpected tokens (like {} and []) appear out of nowhere. I suspect it has something to do with the way Node writes files using the fs module, but I could be wrong. Please check the code block below, which handles that part.

const newBlog = new Blog(title, author, release, cover)
console.log(newBlog)

// TODO: write blog objects into blogs.json file
fs.readFile("blogs.json", function (err, data) {
    if (!err) {
        var json = JSON.parse(data)
        json.push(newBlog)

        fs.writeFile("blogs.json", JSON.stringify(json, null, 4), function (err) {
            if (!err) {
                console.log("wrote one object")
            } else {
                console.log(err)
            }
        })
    } else {
        console.log(err)
    }
})
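
One possible explanation, assuming the block above runs once per scraped post inside a loop: several fs.readFile and fs.writeFile callbacks may be in flight against blogs.json at the same time, so a read can see a partially written file and writes can overlap. The Node.js documentation warns against calling fs.writeFile repeatedly on the same file without waiting for the callback. Below is a minimal sketch of a read-modify-write step that cannot interleave, using the synchronous fs calls. appendBlogSync is a hypothetical helper name, not code from this PR.

```javascript
const fs = require("fs");

// Hypothetical helper: append one blog object to a JSON array file.
// The synchronous calls make each read-modify-write finish before the next starts.
function appendBlogSync(filePath, blog) {
    let blogs = [];
    if (fs.existsSync(filePath)) {
        const raw = fs.readFileSync(filePath, "utf8").trim();
        // Treat an empty file as an empty list instead of failing in JSON.parse
        blogs = raw === "" ? [] : JSON.parse(raw);
    }
    blogs.push(blog);
    fs.writeFileSync(filePath, JSON.stringify(blogs, null, 4));
    return blogs.length;
}
```

Calling appendBlogSync("blogs.json", newBlog) in the loop would then replace the nested readFile/writeFile callbacks, at the cost of blocking while each write completes.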

Work remaining todo:

  • Build: scraper that fetches contents from zairza.blog.in and stores them as objects
  • Fix: above mentioned issue
  • Make this code accessible through the app.js file
  • Create a template in blog.ejs for rendering the blog sections
  • Run a forEach over the entries in blogs.json at the end to render the scraped data on our website

Hope this brings clarity to issue #68. All suggestions are appreciated. 👍

@heytulsiprasad
Contributor Author

heytulsiprasad commented Dec 18, 2019

Debugging Stats

  • The aim is simply to overwrite file.json each time the while loop executes.
  • According to the console.log(newBlog) call, the object looks like the following:

[image: object]

  • Later, when this block executes, the object is expected to be written to the JSON file:
    fs.writeFile('file.json', JSON.stringify(newBlog, null, 4), (err) => {
        if (err) throw err;
    });
  • However, the actual value written to file.json is this:

[image: output]

  • The printed object is fine; the extra stray strings and tokens written to the JSON file are what need to go.
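
If the goal is only to overwrite file.json on each pass of the loop, one thing worth ruling out is overlapping writes: fs.writeFile is asynchronous, so a loop can start the next write before the previous one has finished. The sketch below waits for each write before starting the next, assuming Node's promise-based fs API is available. overwriteEach and its blogs parameter are illustrative names, not part of this PR.

```javascript
const fsp = require("fs").promises;

// Illustrative helper: overwrite the file once per blog, one write at a time.
async function overwriteEach(filePath, blogs) {
    for (const blog of blogs) {
        // await makes sure the previous write has completed before the next begins
        await fsp.writeFile(filePath, JSON.stringify(blog, null, 4));
    }
}
```

After overwriteEach resolves, the file should hold only the last object passed in, serialized as a single JSON value.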

@ankitjena
Collaborator

@tulsi-prasad it's better if you work on a separate branch. Do not work on master.

@ankitjena
Collaborator

@tulsi-prasad you don't need to read the file and write it every time. Instead, append each object to an array and write to the file once at the end.
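
A rough sketch of that suggestion, under the assumption that the scraped posts are available as plain objects with title, author, release, and cover fields. makeBlog stands in for the PR's Blog constructor, and saveBlogs is a hypothetical name.

```javascript
const fs = require("fs");

// Hypothetical stand-in for the Blog constructor used in the PR
function makeBlog(title, author, release, cover) {
    return { title, author, release, cover };
}

// Collect every scraped post in memory, then write the file a single time.
function saveBlogs(filePath, scrapedPosts) {
    const blogs = [];
    for (const post of scrapedPosts) {
        blogs.push(makeBlog(post.title, post.author, post.release, post.cover));
    }
    // One write after the loop, so no two writes can overlap
    fs.writeFileSync(filePath, JSON.stringify(blogs, null, 4));
    return blogs;
}
```

Because the file is written only once, there is no need to read and re-parse blogs.json between posts.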

@heytulsiprasad
Contributor Author

@tulsi-prasad you don't need to read the file and write it every time. Instead, append each object to an array and write to the file once at the end.

This can be done. I'll put that in the next commit.

Shall I close this PR and make another from a different branch or just make a new one?

@ankitjena
Collaborator

@tulsi-prasad It's alright for now; keep this in mind for next time.

@heytulsiprasad
Contributor Author

Closing this PR due to local branch conflicts. Will make one asap from a different branch.

2 participants