How can I tell how long a PHP script can run + how to extend - CSS-Tricks


Viewing 6 posts - 1 through 6 (of 6 total)

Author

Posts
October 29, 2009 at 5:43 pm #26600

mattvot
Member

I’ve made this PHP file that could potentially take days to complete.

Initial tests show it stops working after some time.

I would like to know what the time limit is for PHP scripts on the server.

Also, does anybody know a way to raise the limit? It doesn’t necessarily have to be a straightforward config update.

Thanks
Matt

October 30, 2009 at 8:24 am #66051

Argeaux
Participant

Check this out for more info:

http://php.net/manual/en/function.set-time-limit.php

But why on earth do you have a script that takes days to complete? Maybe there is a better way if you explain what you are trying to do…
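
For the original question, a minimal sketch of reading the configured limit and asking for a longer one. It assumes the host allows set_time_limit() to be called; the one-hour value is only an illustration. The web server's own timeout settings can still cut a request short regardless of what PHP allows.

```php
<?php
// Read the configured limit in seconds; 0 means "no limit".
// (The CLI defaults to 0; the usual web default is 30.)
$limit = (int) ini_get('max_execution_time');
echo "max_execution_time: {$limit} seconds\n";

// Ask for a new limit. The timer restarts from zero at this call.
// The function returns false if the limit could not be changed.
$ok = set_time_limit(3600); // 3600 is an illustrative value
echo $ok ? "Limit set to one hour\n" : "Could not change the limit\n";
```

Passing 0 to set_time_limit() removes the limit entirely, which is rarely a good idea for a script served through a web server.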

October 30, 2009 at 8:38 am #66053

mattvot
Member

Haha,

Well, my client struck a deal with a massive website and is allowed to be a redistributor of all their data. At first the XML feed was too big to process, so I got them to split it up by country. Now my script can process one feed at a time, which reduces the run times.

Let’s take the USA feed alone for example: there are over 60,000 items, and PHP could import them at about 6,000/hour. But then…

We needed to create local thumbnails for the images that come along with the feed. That takes a HELL OF A LOT longer, A LOT LONGER. Like from 6,000/hour to about 600/hour.

So PHP times out and it’s frustrating because the alternative is to do all this processing on my laptop and then FTP the thumbnails up, but this will place my laptop out of action for who knows how long.

I’ve looked at set_time_limit() before and got nowhere, but I did see this this time round:

Quote:

Note: The set_time_limit() function and the configuration directive max_execution_time only affect the execution time of the script itself. Any time spent on activity that happens outside the execution of the script such as system calls using system(), stream operations, database queries, etc. is not included when determining the maximum time that the script has been running. This is not true on Windows where the measured time is real.

I am including function files and using a MySQL database, so that probably messes with it, but I’m not sure how to resolve it.

October 30, 2009 at 10:58 am #66057

Argeaux
Participant

Hmm, I haven’t handled such large XML files before.

Maybe the way to do it is in chunks of (x) items.
Then keep track of where you are in the XML file in a database.

You make a cron job which calls the PHP file every 5 minutes to process (x) lines and then saves the line it stopped on, so the next cron run can start there. You will have to find out how many lines the script can process in 5 minutes to fill in (x).

I am not sure how to do this, I am just thinking out loud and maybe it helps you.
It’s probably a smart idea to find a forum dedicated to PHP for this, because it’s a tough question.
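
The chunk-and-resume idea above could be sketched roughly as follows. This is only an illustration under stated assumptions: processNextChunk() is a hypothetical helper, the checkpoint is kept in a plain file rather than the database suggested above, and the feed is assumed to be already parsed into an array of items.

```php
<?php
// Hypothetical helper: process up to $batchSize items starting at the
// offset saved in $checkpointFile, then return the new offset.
function processNextChunk(array $items, string $checkpointFile, int $batchSize, callable $handler): int
{
    // Resume from the saved position, or start at 0 on the first run.
    $offset = is_file($checkpointFile) ? (int) file_get_contents($checkpointFile) : 0;

    foreach (array_slice($items, $offset, $batchSize) as $item) {
        $handler($item);
        $offset++;
        // Save after every item so a crash repeats at most one item.
        file_put_contents($checkpointFile, (string) $offset);
    }

    return $offset;
}

// Usage: each call stands in for one cron run.
$checkpoint = sys_get_temp_dir() . '/feed_checkpoint_' . uniqid() . '.txt';
$items = ['a', 'b', 'c', 'd', 'e']; // placeholder feed items
$seen = [];
$handler = function ($item) use (&$seen) {
    $seen[] = $item; // placeholder for the real import work
};

$first = processNextChunk($items, $checkpoint, 2, $handler);
$second = processNextChunk($items, $checkpoint, 2, $handler);
$third = processNextChunk($items, $checkpoint, 2, $handler);
unlink($checkpoint); // clean up the demo checkpoint
```

If a run can take longer than the cron interval, two runs may overlap, so a real version would probably also need some form of locking.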

November 4, 2009 at 3:52 pm #66244

davesgonebananas
Member

I agree that the way in which you are going about this is not the most efficient approach.

Firstly, 6000 transactions/hour = 100 transactions/min ≈ 1.67 transactions/sec = a very slow script! It’s possible that it’s just necessarily slow, that PHP isn’t the optimal scripting language, or that it’s just running on an overloaded web server.

This brings me to my first point: don’t run the script through a website, which can time out, etc. The web server will limit the amount of available RAM and other resources the script can use. Instead, run the script directly from the command line using php myscript.php. This will be much more efficient, although it will still be difficult to manage if it’s going to take several days.
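
One way to make sure the script really is being run from the command line is to check the SAPI name at the top of the file. A minimal sketch, where isCommandLine() is a hypothetical helper and not part of PHP itself:

```php
<?php
// Hypothetical guard: true only when PHP runs from the command line.
function isCommandLine(string $sapi = PHP_SAPI): bool
{
    return $sapi === 'cli';
}

if (isCommandLine()) {
    // On the CLI, max_execution_time defaults to 0 (no limit).
    echo "Running from the command line\n";
} else {
    echo "Run this script with: php myscript.php\n";
}
```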

If this is quite a processor/memory intensive operation (and it sounds like it is) consider using Amazon EC2. With Amazon EC2 you can get a virtual private server up and running in minutes, and you are only charged by the hour for use. It’s really cheap (and you can of course pass any costs on to your client).

Something else worth looking into would be Amazon Simple Queue Service. It’s designed for this sort of scenario where you have a lot of processing to do in batch. It’s sort of a similar idea to breaking into chunks and using the cron, only more reliable.

Firstly, you will need to create your queue:

Code:

Read the xml feed.
For each item in the feed:
    [Optional: Upload the item to Amazon S3]
    Create a message on the queue that contains the item [or a link to the item]

Then in the processing servers:

Code:

Read a message from the queue
Extract the item from the message
[Optional: Download the item from Amazon S3 if stored there]
Process the item – do whatever you need to do basically
Generate the thumbnail
Rinse and repeat
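
The two steps above can be illustrated with a small in-memory sketch. This is not Amazon SQS: SplQueue from PHP's standard library stands in for the hosted queue, the item IDs are made up, and the "processing" is a placeholder. It only shows the producer/worker shape.

```php
<?php
// In-memory stand-in for the queue. A real setup would use a hosted
// queue so that several servers can read from it.
$queue = new SplQueue();

// Producer: one message per feed item (these IDs are invented).
foreach ([101, 102, 103] as $itemId) {
    $queue->enqueue(['item_id' => $itemId]);
}

// Worker: read a message, process it, repeat until the queue is empty.
$processed = [];
while (!$queue->isEmpty()) {
    $message = $queue->dequeue();
    // Placeholder for the real work (import the item, make the thumbnail).
    $processed[] = $message['item_id'];
}

echo count($processed) . " items processed\n";
```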

The benefit of this strategy is that you can have as many processing servers reading the queue as you like. So if you think it’s going to take 72 hours on one server, just create 7 servers and it should all be done in a little over 10 hours (72 / 7 ≈ 10.3), assuming the work splits evenly.
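
As a rough check using the figures quoted earlier in the thread (about 60,000 items, 6,000/hour for the import alone, about 600/hour with thumbnails), the run time can be estimated like this. estimateHours() is a hypothetical helper, and it assumes the work divides evenly across servers with no overhead.

```php
<?php
// Hypothetical helper: estimated hours for a batch job, assuming the
// work splits evenly across $servers with no coordination overhead.
function estimateHours(int $items, int $itemsPerHour, int $servers = 1): float
{
    return $items / ($itemsPerHour * $servers);
}

echo estimateHours(60000, 6000), "\n";              // import only: 10 hours
echo estimateHours(60000, 600), "\n";               // with thumbnails: 100 hours
echo round(estimateHours(60000, 600, 7), 1), "\n";  // thumbnails on 7 servers
```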

http://aws.amazon.com/ec2/
http://aws.amazon.com/sqs

November 7, 2009 at 9:04 am #66353

mattvot
Member

Thanks both of you.

Yeah, well, the first run-through of the script will take forever, but it is coded in such a way that when it realises it has already parsed the item it is looking at, the script stops that feed and moves on to the next one.

The forum ‘Back End’ is closed to new topics and replies.