I was working on a task that scrapes several web pages. After running it for a while, I found that the memory taken by my process kept rising until it was about to eat all the memory available on the server.
After some investigation, I realized the problem was in my understanding of how the Mechanize agent works.
Let me explain with an example:
agent = WWW::Mechanize.new
while true
  # every page fetched is appended to the agent's internal history
  page = agent.get('http://www.example.com')
end
In this example, memory keeps growing because Mechanize stores every fetched page in a history inside the agent. I looked in its documentation and found a parameter called "max_history" that limits the size of this history; setting it should fix the issue, I think, though I haven't tried it myself.
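Based on the documentation, the fix would look something like this (just a sketch, I haven't tested it):

agent = WWW::Mechanize.new
agent.max_history = 1 # keep at most one page in history
while true
  page = agent.get('http://www.example.com')
end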
Another fix, if you don't need the history at all, is to write your code like this:
while true
  # a fresh agent each iteration carries no history, so old pages
  # can be garbage collected
  agent = WWW::Mechanize.new
  page = agent.get('http://www.example.com')
end
That's it. Maybe this piece of information will be useful for someone facing the same issue.