Monday, June 29, 2009

Memory Leakage while using Mechanize

I was working on a task that scrapes several web pages. After it had been running for a while, I found that the memory used by my process kept growing until it was about to consume all the memory available on the server.

After some investigation, I realized that the problem was in my understanding of how the Mechanize agent works.

Let me explain with an example:

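require 'mechanize'

# One agent is reused for every request
agent = WWW::Mechanize.new
while true
  # Each fetched page is also stored in the agent's history
  page = agent.get("http://www.example.com")
end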

In this example, memory keeps being consumed because Mechanize keeps a history of visited pages within the agent. I looked in its documentation and found a parameter called “max_history” which, when set, should fix this issue, although I have not tried it myself.
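To make the effect of a history cap concrete, here is a small toy model in plain Ruby. It is only a sketch: `ToyAgent` and its internals are hypothetical names invented for illustration and are not how Mechanize is actually implemented. It just shows why an uncapped history grows with every request while a capped one stays flat.

```ruby
# Toy model of an agent that remembers every page it fetches.
# ToyAgent is a hypothetical class, not part of Mechanize.
class ToyAgent
  attr_reader :history

  def initialize(max_history = nil)
    @max_history = max_history # nil means "keep everything"
    @history = []
  end

  def get(url)
    page = "body of #{url}" # stand-in for a real HTTP response
    @history.push(page)
    # Drop the oldest entries once the cap is exceeded
    @history.shift while @max_history && @history.size > @max_history
    page
  end
end

unbounded = ToyAgent.new
bounded = ToyAgent.new(1)

1000.times do |i|
  unbounded.get("http://www.example.com/#{i}")
  bounded.get("http://www.example.com/#{i}")
end

puts unbounded.history.size # prints 1000
puts bounded.history.size   # prints 1
```

With a real agent, the equivalent step would presumably be setting “max_history” on the agent before the loop starts, which, as said above, I have not tried.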

Another fix, if you don’t need the history, is to create a new agent on each iteration, like this:

require 'mechanize'

while true
  # A fresh agent each time, so no history piles up
  agent = WWW::Mechanize.new
  page = agent.get("http://www.example.com")
end

That’s it. Maybe this piece of information will be useful for someone facing the same issue I did.

Monday, June 15, 2009

Install Mechanize On Debian

Installing the mechanize gem on Debian should be as easy as running this command:

gem install mechanize

but this won’t succeed unless you first install these packages on your Debian machine:

apt-get install libxml-dev libxslt1-dev

Once they are installed, the gem will install seamlessly.

Update:

If the above apt-get command didn’t work for you and you got an error that a package doesn’t exist, try this line instead:

apt-get install libxml2-dev libxslt1-dev

as some package names have changed.
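To confirm that the gem really got installed, a quick check from Ruby can help. This is just a minimal sketch; `library_available?` is a helper name I made up for illustration, and on older setups you may need to require rubygems first.

```ruby
# Returns true if the named library can be loaded, false otherwise.
# library_available? is a hypothetical helper, not part of any gem.
def library_available?(name)
  require name
  true
rescue LoadError
  false
end

if library_available?("mechanize")
  puts "mechanize is installed"
else
  puts "mechanize is missing"
end
```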

[Linux] Mounting Windows folders

Due to my new setup, which uses Windows as my default OS and a Debian shell through VirtualBox, having a folder shared between these two environments became a must in order to ease file sharing and exchange.

At first, I depended on the “shared folders” feature of VirtualBox, which failed after a while: I always got a “Protocol Error” whenever I did file I/O on that shared folder.

That’s why I looked for a more stable solution and ended up using “CIFS” to mount a Windows shared folder on my Debian machine.

The steps are very easy and give you a stable, robust solution away from VirtualBox problems. I can summarize them as follows:

  1. Install the “smbfs” package on your Debian machine, which adds the new mount type called “CIFS”.
  2. Suppose your Windows machine’s IP is “192.168.1.6” and you shared a folder on it called “work”, and this folder is restricted to a certain group (administrators), one of whom is named “BioNuc” and has a password. Then your command will be:
     mount -t cifs -o username=BioNuc '\\192.168.1.6\work' <linux-path-to-mount-on-it>
  3. After entering this command, you will be asked for your password in order to complete the mount.

That’s it. Enjoy sharing folders seamlessly.