PHP as a replacement for sed/awk

I had gotten sick of not being able to do what I wanted to do with sed/awk – partly because I am not too familiar with those – and, was investigating replacements for those. Many people seem to be using perl and since I wasn’t too familiar with Perl as well, I started thinking about using PHP – which I do know. And, that served my needs brilliantly.

This is a simplistic version of my problem & a solution:
I can do the following to get the values for specific params (cid and oid in this case) for requests made, from my access logs:

k7@local:echo "http://example.com?cid=123&oid=435" | php -R 'preg_match("#.*cid=([^&]+)&oid=([^&]+)#",$argn, $matches);if(!empty($matches)) {echo $matches[1], " ", $matches[2], "n";}'
123 435

Regular Expression to check for comma separated list of Integers

//get rid of all kinds of spaces
$IDs = preg_replace('/s+/', '', $inputIDs);
if(!preg_match(''/^\d+$|^(\d+,)+\d+$/'', $IDs)) {
echo "Invalid Input";
}

You could use the validated IDs above in a SQL like the this:

SELECT * FROM t1 WHERE id IN($IDs)

Disclaimer: I am not going to be responsible for any kind of SQL injection resulting from using the snippet above.
updated 20-Mar-2013

PHP, MongoDB, findAndModify

findAndModify is not supported by the current version of the PHP MongoDB driver, so I had to use the following code

$feed = $db->command(array('findAndModify' => 'feedList', 'query' => array('$where' => new MongoCode('
                                 function(){ 
                                      return ((this.inProgress == false)  
                                          || (currentTime >= (this.lastUpdated + this.updateFrequency))));
                                     }', array('currentTime' => $start))), 'update' => array('$set' => array('inProgress' => true))));

Where:

  • $db = Database handle
  • ‘feedList’ = Name of the Collection(Table)
  • inProgress, currentTime,updateFrequency and lastUpdated are some keys
  • $start = a MongoDate Obj

Essentially I am locking a document(set ‘inProgress’ to true) by finding and updating the first document that that is not locked by another process (‘inProgress’ set to false and a few other conditions)

Notice the use of ‘this’ in the MongoCode Obj. I think it refers to the current Document that is being processed.

Also be aware that findAndModify only finds and modifies the first document found.

Munin: Could not open required defaults file: /root/.my.cnf

For some reason I kept getting this error (even when the file was readable by all) in one of the munin (mysql slave) plugins that I had written.

So this was the solution that worked for me;

under the [mysql*] section in /etc/munin/plugin-conf.d/munin-node 
I added this: 
env.mysqlopts -u{user} -p{password}

and in my plugin

I did something like this:

mysql ${mysqlopts} -e'show slave status G'

and that seemed to have picked up the user and password from env.mysqlopts and the plugin started working (ofcourse after I restarted munin).

MySQL KEY, PRIMARY KEY and NOT NULL.

Have a look at this sql statement:

mysql> CREATE TABLE `t1` (`id` INT NULL AUTO_INCREMENT, KEY(`id`));
Query OK, 0 rows affected (0.92 sec)
 
mysql> SHOW CREATE TABLE `t1`;
+-------+-------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table                                                                                                            |
+-------+-------------------------------------------------------------------------------------------------------------------------+
| t1    | CREATE TABLE `t1` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
   KEY (`id`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 |
+-------+-------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.08 sec)

See, how MySQL implicitly converts the NULLable column `id` to NOT NULLable because the column `id` was also a key?

Also have a look at the following statements:

mysql> CREATE TABLE `t2` (`id` INT NULL AUTO_INCREMENT KEY);
Query OK, 0 rows affected (0.11 sec)
 
mysql> SHOW CREATE TABLE `t2`;
+-------+-------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table                                                                                                            |
+-------+-------------------------------------------------------------------------------------------------------------------------+
| t2    | CREATE TABLE `t2` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  PRIMARY KEY (`id`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 |
+-------+-------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

Having KEY in the column specification makes that column a PRIMARY KEY!

I am going to go through these pages:
http://dev.mysql.com/doc/refman/5.0/en/silent-column-changes.html
http://dev.mysql.com/doc/refman/5.0/en/create-table.html

and try to understand why the context really matters while specifying the key. and also the reasoning behind implicitly changing stuff instead of throwing an error.ATM, it just doesn’t make any sense!

Wireless connection dropouts in Lucid Lynx

Since my upgrade to Lucid, I have been experiencing dropouts on my wireless connection and since the solutions that I had found on the net did not solve the above problem, I decide to script a temporary fix until it gets fixed in (K)Ubuntu or in the Kernel.

I found that that after I get disconnected if I were to disable wireless or sometimes disconnect and then try to (enable wireless &) connect again I am connected for a random amount of time. I automated this by writing a bash script to check for connectivity to the net and try to reconnect if there is no connection.

This has worked well for me so far and if your are experiencing similar issues, give it a try. The script is available here.

Update: Looks like the wireless dropout issue has been been fixed in Lucid 🙂

Lucid Lynx — Downgrading from PHP 5.3

I hit this bug after the upgrade to PHP 5.3 (which happened with my Kubuntu Upgrade to Lucid Lynx). I read in another bug report that this issue was fixed in the latest snaphshot, so I tried compiling from source – using the latest 5.3 snapshot. This didn’t go well, as I hit a seg fault after installation and after about an hour of playing with gdb, I gave up!
So, I had to downgrade to PHP 5.2 ;(. Now lucid repos have 5.3 and I had no idea what to do. Googled around and came across a solution here. That worked. I just had to make sure that each individual php5 package (php5-dev, php5-memcache, php5-curl, etc,.) was listed in the file like this;

Package: php5
Pin: release a=karmic
Pin-Priority: 991

Package: php5-dev
Pin: release a=karmic
Pin-Priority: 991

Package: php5-curl
Pin: release a=karmic
Pin-Priority: 991

and so on(try to script the generation of the file).

I was very glad that the designers of apt had made this easy for us! I knew yum had priorities but, I wasn’t aware of a similar solution for apt. Everyday you learn something new 😉

Hacker News Sorter

I have just recently created a bookmarklet to sort the news stories on the front page of HN.

There are a  few minor issues that I have noticed but, should work for most parts.

When you click on the bookmarklet, two links get created at the top as shown in the screenshot below:

front_page_sort

Clicking on ‘Comments’ will sort the posts by the number of comments that they have and Clicking on ‘Points’ will sort the stories by the number of points associated with the stories. Clicking on the links more than once will toggle the sort direction.

The code is available on GitHub(Do whatever you want license) here.

I have used jQuery and a cookie plugin as it saved me a lot of time.

So here is the link to the bookmarklet: HN Sorter

InnoDB + MySQL Implicit Autocommit

Today, I spent about an hour debugging a part of the script that I was writing. This is what I faced:
I had a function writing some logs into a table. The function was pretty simple as all it had to do was insert some data into a row (using the data passed to it). The problem was that the data wasn’t getting inserted. I tried hard-coding the data into the sql statement(originally I was using Zend_Db_Pdo_Mysql::insert() method and it was returning 1 after an insert) and that wasn’t helping either.

I asked my colleague to have a look at the code to see if he could detect any issues. We were trying different things and then after a while he asked if there were any transactions involved. Since I was using the InnoDB engine for all my tables, this made me to start thinking. I had disabled autcommit for a series of transactions (that involved a SELECT … FOR UPATE statement – hoping to acheive row level locking). Since, I was using the same Db object within the function that had the above statement and in the function that was doing the logging afterwards, we realised there could be a problem there. On closer inspection I found that I had forgot to commit the transaction that involved the SELECT … FOR UPDATE statement and since the logging function was using the same Db object afterwards the logging became part of the transaction that never got committed!

But, it turned out that when I modified the sql statement in the logging function to something like ‘DROP TABLE audit’, that actually dropped the table even-though I didn’t explicitly commit (after disabling autocommit). I was intrigued by this behaviour and found that MySQL + InnnoDB implicitly commit after certain statements(http://dev.mysql.com/doc/refman/5.0/en/implicit-commit.html).

I also found out PostsgreSQL(and a few others) exhibit a different behaviour (and IMO better). You can pretty much ROLLBACK after all statemnts that do not involve dropping the db or a tablespace
(http://wiki.postgresql.org/wiki/Transactional_DDL_in_PostgreSQL:_A_Competitive_Analysis).

I guess the moral of the story is: “BEGINning without COMMITing to it can only be a waste of time” ;).