Archive for the ‘MegaLab’ Category

iSpot / MegaLab server moves

Thursday, February 11th, 2010

The recently configured back-up architecture and plan were tested over the past couple of weeks with a live swap-over for iSpot and the Evolution MegaLab, brought about by a disk failure on the live web server. The switch-over involved moving from the live web and database/file servers to two completely separate back-up web and database/file servers, which then assumed the role of live servers whilst maintenance was performed on the original live web server.

The switch to the back-up servers went fairly smoothly, with only a minimal amount of downtime for the live websites. With the exception of the quiz on MegaLab (which needs to be looked at), both sites continued to function as normal on the back-up architecture.

The disk failure on the live web server was rectified after about two weeks, which allowed the sites to revert to the original live web server and database/file server set-up. A procedure for these switches had been documented and was followed. We had originally believed that the swap back to the live set-up had gone very smoothly, and in terms of the technical tasks performed it had. However, there were a couple of issues which resulted in some significant downtime of the live sites:

  1. While the live site files were being copied back to the live web servers from the ‘backup live’ servers, the websites were put into maintenance mode, so that the public could not access them and no data was written to the databases. Maintenance mode is controlled by a flag in the database. After the files and databases had been copied back onto the live servers, the DNS records were changed so that requests to iSpot and the MegaLab were directed to the original live servers. The flag in the databases on the live servers was set to online, so that as the DNS changes propagated to the DNS servers of various ISPs (this generally takes between 24 and 72 hours and is out of our hands), users would be able to access the sites on the live servers. The oversight on our part was that a scheduled task, part of the original back-up routine, backs up the live database and overwrites the copy on the back-up server. This task did its job and overwrote the database on the back-up server with the one from live, which now had the offline/online flag set to online. This happened within an hour of switching back to the live set-up and before the DNS propagation had taken effect, so requests for iSpot were generally still going to the back-up server, which had now (inadvertently) been set to online. Users began posting observations, and we believed that we were seeing the site on the live servers. Then, at hourly intervals, the database on the back-up server (which was being updated with user observations etc.) was overwritten with the database from the live site, which was effectively old data. The issue was picked up when a new news feature item disappeared from the home page. The scheduled task was then stopped and the flag on the back-up servers was set to take the sites offline again. Once the propagation of the DNS records started to take effect, users were successfully directed to the sites on the live servers again.
  2. The second problem was that we had requested internally that the external hosting company be notified to update the DNS settings for iSpot and the MegaLab, and our request was not fulfilled. When we chased the request we were told that it was the responsibility of a different person, so we then had to chase that person to get it actioned. We were given no notice that the person we had logged the request with was not responsible for it.
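
To make the first problem concrete, here is a minimal sketch of a guard that the scheduled back-up task could apply before overwriting the copy on the back-up server. It is an illustration only, not the code that runs on our servers: the databases are modelled as plain dictionaries, and the names `sync_backup` and `site_online` are hypothetical.

```python
import copy


def sync_backup(live_db: dict, backup_db: dict) -> dict:
    """Return the state of the back-up database after one scheduled run.

    Both databases are modelled as dictionaries; `site_online` stands in
    for the offline/online flag (a hypothetical name).
    """
    # If the back-up copy is online it may be receiving user data,
    # so overwriting it with the live copy would lose that data.
    if backup_db.get("site_online", False):
        return backup_db

    # Otherwise take a fresh copy of live, but never carry the online
    # flag across: the back-up copy stays in maintenance mode.
    refreshed = copy.deepcopy(live_db)
    refreshed["site_online"] = False
    return refreshed
```

With a guard along these lines, a back-up server still receiving requests during DNS propagation would keep showing the maintenance page instead of quietly accepting observations that the next hourly run would discard.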

Lessons learnt

  • Apart from the obvious frustration of a period of a couple of hours where data was lost, we were generally very pleased with how the back-up architecture performed and how smoothly the transition between the systems went.
  • We have updated our documentation to reflect any changes to the procedure to make sure that data isn’t lost in the future.
  • During the switch back to live it came to light that there may be an alternative and more efficient way of switching between servers that will result in less downtime of the site for users. Instead of asking the external hosting company to switch the destination of the request to iSpot and the Megalab on their DNS servers and waiting for these changes to propagate, there is a tool that we can use that can internally (internal to our network) control the routing to various servers/IP addresses for given requests. This would effectively mimic a change to an external DNS record but as there would be no external propagation required the switch between servers would be almost instantaneous which should therefore result in only a minimal amount of down time for the live servers.

MegaLab official data gathering finish

Friday, November 6th, 2009

The MegaLab officially stopped gathering data as of 31st October. This does not mean that the site is no longer available; in fact it’s business as usual for the Evolution MegaLab. It does mean that analysis has now begun on the data collected to date. To help with this I’ve added two new fields to the full data download that’s available to privileged users: an id field that gives a unique reference to every line of the download, and the date the record was published, which is different from the date the record was made.

It’ll be interesting to hear the outcomes of the data analysis!

New downloads page

Thursday, July 9th, 2009

I spent a couple of days last week developing a new interface to replace the existing one for the downloading of records from the Evolution MegaLab.

It was an oversight on my part, when I initially built the site, that the downloads page did not scale well in the face of thousands of added records. I’ve now brought it in line with the rest of the site, with the embedded Google map using clustering as the other maps on the site do.

The new interface allows any cluster to be selected, or individual pins if zoomed in enough, and the associated records then downloaded. I’ve also moved the generation of the full report to the hourly cron job, which means that the full report may be as much as one hour behind the actual records, but the download is now almost instantaneous.
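
The idea of generating the report ahead of time can be sketched roughly as follows. This is a simplified illustration under my own assumptions, not the site’s code: the function names, the column names and the CSV format are all hypothetical. The scheduled job writes the report to a temporary file and swaps it into place, so a download only ever has to read a finished file.

```python
import csv
import os
import tempfile


def write_full_report(records, path):
    """Run from the scheduled job: rebuild the report, then swap it in."""
    directory = os.path.dirname(os.path.abspath(path))
    # Write to a temporary file in the same directory first, so a
    # download never sees a half-written report.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["id", "species", "recorded_on"])  # hypothetical columns
        for record in records:
            writer.writerow(
                [record["id"], record["species"], record["recorded_on"]]
            )
    # Replace the previous report with the new one in a single step.
    os.replace(tmp_path, path)


def read_full_report(path):
    """Run on download: no queries, just return the prepared file."""
    with open(path, newline="") as handle:
        return handle.read()
```

The trade-off is the one described above: the file can be up to an hour stale, in exchange for the download doing no work beyond reading it.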

MegaLab update

Friday, May 1st, 2009

With the number of records submitted by users growing all the time, a major memory problem came to light with the cron job that had been scheduled to run hourly. This culminated in the server becoming unresponsive and the site becoming inaccessible. I had previously come across this problem when uploading large amounts of data to the historic part of the site (see Historic records uploaded), so I knew what I had to do to fix it, although it still took some time to track down the exact parts of the code that needed changing.

The cron job has now been reinstated for about a week and the memory problem looks to have been resolved. I think updating the Symfony framework would also have cured the problem, although I was reluctant to do this in case I introduced other bugs into the live site.

MegaLab changes and bugs

Friday, March 20th, 2009

I’ve been steadily working through the changes that Jenny has been listing for the site. Most of these are small text changes and localised content for the foreign collaborators, so hopefully the site content is close to final.

After the last development phase that I carried out there’s been a steady flow of bugs, which I suppose is expected, although still annoying. I think these can be categorised into the following major areas:

Internationalisation

Bugs relating to the change from using just language for the translated versions to using culture, i.e. a user’s country as well as language. I knew this was going to be a major change and I haven’t been surprised by the various issues that have cropped up.

Linux

The move from a Windows server to Linux has been the cause of several annoying bugs, which seem to keep cropping up. I think the main thing has been file permissions not allowing the site to create new images. I’m hopeful that these bugs have now been sorted, although I’m keeping a close eye on the server.

Email

I think there’s been a change in OU policy, or something relating to sending emails, which has caused problems with users trying to register on the site. Although the site can now send emails, the address it’s using is on an OU domain and not the website domain. This may be causing some mail servers to bounce the emails because the from address does not match the sending domain. I’ve yet to deal with this… it’s on my list.

As well as this we moved the hosting of the site to be handled through the OU. This had the knock-on effect of adding another layer for me to go through when doing things like setting up email forwarding from the site domain, which again has led to email addresses not working.

MegaLab bugs disappearing ready for the snails

Thursday, March 5th, 2009

I’m hopeful that the majority of the bugs with the site have now been ironed out, although time will tell. Most of these centred around the implementation of the culture code in place of just a language code.

As well as fixing the bugs I’ve addressed the inconsistency there was with the pie charts not always having their colours in the same order. In doing this I also found and fixed a couple of bugs that were potentially mis-drawing the pie charts in some situations.

Also, Mike will be pleased to learn that the accuracy of the calculation of distance between historic and current locations has been improved, and it now takes into account the convergence of the lines of longitude on the Earth.
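
For anyone curious what accounting for the convergence of the lines of longitude involves, the haversine formula is one standard way of doing it. The sketch below is illustrative only and is not necessarily the calculation the site uses; the function name and the mean Earth radius of 6371 km are my own choices for the example.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, assumed for this sketch


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points in degrees."""
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    # The cos(phi1) * cos(phi2) factor shrinks the east-west term as the
    # points move away from the equator, which is the convergence effect.
    a = (
        math.sin(d_phi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))
```

Under this formula one degree of longitude is roughly 111 km at the equator but only about half that at 60 degrees north, which is the error a flat treatment of latitude and longitude would introduce.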

One thing that hasn’t been done yet is the batch adding of the post-2000 data that contributors have supplied. I’m not sure how important this is, so I’ll have to get clarification on this first.

Latest MegaLab version due to go live tomorrow

Monday, February 16th, 2009

We’re planning to make the latest version of the site live tomorrow morning. This will mean the site going off-line for a period, but I’m hoping it shouldn’t be down for long. I’ll be spending today carrying out a dry run on the development server and noting the necessary steps in transforming the current data into the format needed. I’ll also be doing a test upload of all the historic data to make sure this will run smoothly on the live site tomorrow.

I’m leaving the ‘beta’ tag on the site for the time being, but plan to remove this, all things being well, before the main launch at the end of March.

A week of tidying up loose ends

Friday, January 30th, 2009

I’ve managed to get quite a lot of jobs out of the way this week that have been hanging around for a while. These include moving the domain registrations for the Evolution MegaLab over to the OU, sorting out the Google Analytics account to give Jenny access, and sorting through many emails that I’d left to one side.

I finished the bulk of the MegaLab work last week, but on Monday I was adding a bit more info about any historic record found close to a current record, namely the latitude and longitude. In the process I think I found the problem that Jonathan thought there was with the picking of the closest historic record, so that should now be fixed.

I’ve also managed to continue on with iSpot development and have pretty much finished the add an observation block for the front page. I plan to update the live site at the same time as moving it to the new web server, which may be today if I can get to the bottom of the current problem.

Next week I plan to implement the changes to the determination mechanism for the observations.

Evolution MegaLab changes completed

Monday, January 26th, 2009

After a full two weeks’ work, all the changes that I’m able to make have been made. There are still a few things that need to be added, but these all relate to additional content that has yet to be created. As yet the changes only appear on the development site, but when Jenny gives me the additional content and the go-ahead I’ll implement them on the live site.

I haven’t taken the beta label off yet. I think it’s best to do that after users have had a period of using the site with the changes applied, so I can iron out any possible bugs.

For those interested and able to access it the development site is at: evolutionmegalab-dev.open.ac.uk/

Internationalisation and localisation of MegaLab

Monday, January 19th, 2009

It’s been rather a lot of work to allow the site to cater for country-specific content as well as just offering a translated version. Jenny and I have now settled on changing the flags to mean countries instead of languages; for countries with more than one language the user is then presented with a selection box to choose the language. Symfony handles culture (language and country) well, but it would have been better to set this up from the beginning rather than using a cut-down model of just language. Oh well, I’ll know next time.
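
The flag and selection box logic can be sketched roughly as below. This is an illustration in Python, not the Symfony code on the site: the country list is made up for the example, and the `language_COUNTRY` form of the culture code is the common locale convention, which I am assuming here.

```python
# Hypothetical mapping of country codes to the languages offered there.
COUNTRY_LANGUAGES = {
    "GB": ["en"],
    "BE": ["nl", "fr"],
    "CH": ["de", "fr", "it"],
}


def needs_language_choice(country):
    """True if choosing this country's flag should show the selection box."""
    return len(COUNTRY_LANGUAGES[country]) > 1


def culture_code(country, language=None):
    """Combine language and country into a culture code such as 'fr_BE'."""
    languages = COUNTRY_LANGUAGES[country]
    if language is None:
        if len(languages) > 1:
            raise ValueError(f"{country} needs an explicit language choice")
        language = languages[0]
    elif language not in languages:
        raise ValueError(f"{language} is not offered for {country}")
    return f"{language}_{country}"
```

For example, `culture_code("GB")` gives `en_GB` directly, while a country with several languages has to pass the user’s choice, as in `culture_code("BE", "fr")`.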