Thursday, September 30, 2010

Ram Janma Bhumi Ayodhya verdict

The Ram Janma Bhumi verdict is out.
The Allahabad High Court has given the verdict.

Summary of the result, pending the detailed results:
Khan : The premises must be divided into 3 parts. The place where Ram was born must be given to Hindus.
Agarwal : The place where Ram was born is a place of worship - give it to Hindus
Sharma : The entire premises is Ram Janmasthan

Awaiting Ayodhya Verdict Allahabad High Court Website


The Allahabad High Court website is to host the Ayodhya verdict results. The site doesn't seem to respond; it has a poor response time. I managed to get a screenshot of the website.


Tuesday, April 13, 2010

Waiver of Credit Card Late Payment Fees

You have used your credit card instead of your debit card to gain those extra points.
You have been meticulous in paying the credit card bill just before the due date so as to avoid the late payment fees.
Your credit history has been good...
Till one fine day, you miss the deadline for the minimum amount due on the credit card..
In one shot, you lose all the bonus points that you had gathered till now ..

Late payment of credit card dues can be a costly affair.. And I realised that recently ..

Since I was travelling, I did not remember that I had to pay my credit card bills ..
And BAM! .. Rs 700 of late payment fees on my HDFC credit card .. Rs 300 on my citibank credit card ..

Not to worry .. I sent a mail to the customer care of both the credit card divisions, requesting them to reverse the late payment fees as well as the interest charges, and I was lucky enough to get the waiver.. The amount will now be credited back to my card ..

Here is the mail that I had written :

Date: Wednesday, April 07, 2010 12:20 PM
To: India Service (indiaservice@citi.com)
Subject: Late Payment. Credit card ending 2007
Dear Sir / Madam,
This month, I did not remember to pay my Citibank payment by the payment due date. I remembered this today and made a payment immediately. Since this is the first time I have made this mistake in 3 years, is it possible for you to kindly relieve me of the late payment fees ?

I would definitely appreciate a favorable response from your end.

Thanking you,
Vijay

Response :

Credit Card blah blah

We wish to inform you that, with a commitment towards greater customer satisfaction we have reversed the Late payment Charges of Rs. 700.00. The corresponding Service Tax will be automatically reversed. The above reversal will reflect in your next monthly statement.

Similar was the case with HDFC bank credit card division too.

It helps if you create a reminder on your mobile phone / Outlook as soon as you receive the credit card bill so that you don't miss the payment..

AND

If it is the first time that you have paid beyond the due date, it is worth requesting a waiver ..
It is just an e-mail after all ..

Tuesday, June 23, 2009

Immigration to foreign countries - some numbers

Leverage the Skills cost arbitrage - Immigrate.
Here are a few notes that I made. Please add to this if you have any other information.

- Improve your SW skills
Browse through the jobs, and look at the common things that are being asked for.
Keep learning the things. Try to apply them in freelance projects.

Applying for US :

find H1B sponsors.
http://corp-corp.com/h1bonlinejobfair_js.htm?gclid=COO7xd2ot5YCFRNPegodi0IZKQ

Foreign VISA consultants :
http://www.visahouse.net/career_guide.php


Online Job Search Sites
Naukri.

http://www.iitjobs.com
http://www.simplyhired.com

Salaries from payscale (software developer):
Country          Salary          Approx. INR
UAE              1,20,000 AED    15 lakh INR
Singapore        38000 SGD       12 lakh INR
Australia        53000 AUD       17 lakh INR
Canada           75000 CAD       30 lakh INR
United Kingdom   27000 GBP       22 lakh INR
USA              64000 USD       31 lakh INR
Switzerland      89000 CHF       37 lakh INR
Ireland          34500 EUR       23 lakh INR
Denmark          63000 USD       30 lakh INR
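
The INR figures above are rough conversions. A minimal Python sketch to back out the exchange rate implied by each row, using only the numbers from the list (the names `LAKH`, `salaries` and `implied_rate` are mine, made up for illustration; 1 lakh = 100,000):

```python
LAKH = 100_000  # 1 lakh = 100,000

# (country, currency, annual salary, approx. INR in lakh), copied from the list above
salaries = [
    ("UAE", "AED", 120000, 15),
    ("Singapore", "SGD", 38000, 12),
    ("Australia", "AUD", 53000, 17),
    ("Canada", "CAD", 75000, 30),
    ("United Kingdom", "GBP", 27000, 22),
    ("USA", "USD", 64000, 31),
    ("Switzerland", "CHF", 89000, 37),
    ("Ireland", "EUR", 34500, 23),
    ("Denmark", "USD", 63000, 30),
]


def implied_rate(foreign_amount, inr_lakh):
    """Return the INR-per-unit exchange rate implied by one row."""
    return inr_lakh * LAKH / foreign_amount


for country, currency, amount, inr_lakh in salaries:
    rate = implied_rate(amount, inr_lakh)
    print(f"{country:15s} 1 {currency} = about {rate:.1f} INR")
```

This only checks how the rows were converted; it says nothing about cost of living or taxes.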

HSMP program needs you to have 2 lakhs in your account

The fees are 350 GBP - 25,000 INR

--
CANADA:
Canada - Federal Skilled Worker Program
greater than 67 points in the qualifying test - done
At least one year of experience in one of the NOC occupations list:
I come under : 0213 Computer and information system managers :
Software Engineers and Designers (2173)

Immigration Blog :
http://www.canadavisa.com/canada-immigration-blog/

Federal Skilled Worker Program : (New Instructions )
http://www.canadavisa.com/new-instructions-federal-skilled-worker-applications.html

What is needed for the VISA :
http://www.canadavisa.com/canadian-immigration-faq-skilled-workers.html

4 months prior to the evaluation

Settlement of Funds :
This is the amount that is required to pass through the VISA process (settlement of funds)
10 601 Canadian dollars = 4.2 lakh INR
this is waived if you have arranged employment in Canada
VISA application fees :
22,500 INR (550 CDN)
http://www.canadavisa.com/federal-skilled-workers-processing-fees.html

Arranged Employment :
http://www.canadavisa.com/fast-track-canada-immigration-visa-application.html

The Employer - Employee - Arranged Employment process :
http://www.movetoedmonton.com/foreign/arranged/

VISA processing time :
http://www.canadavisa.com/federal-skilled-worker-processing-times.html#asiaandpacific
Seems to take 72 months. Isn't that too long!
These 72 months are for the Permanent Residency VISA. You get the Federal Skilled Worker program VISA earlier.

Thursday, June 04, 2009

Pentaho Data Integration - Scalable ETL deployments

Q&A notes from the following webinar for the benefit of PDI users.





Session number:  713773880

Ranadeep Bhattacharya - 11:48 pm

Q: What do you mean by read in parallel? Does that mean only a part of the file is available in each slave?

Matt Casters - 11:49 pm

A: That's exactly what the algorithm does.  It splits the file by size and divides data ranges over the available nodes.
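
To picture the "split by size" idea, here is a minimal Python sketch (not PDI code) that divides a file of a given size into contiguous byte ranges, one per node. The function name `split_byte_ranges` is hypothetical, and the sketch ignores one thing a real reader must handle: a range boundary will usually fall in the middle of a line, so each node has to align itself to the next line break.

```python
from typing import List, Tuple


def split_byte_ranges(file_size: int, node_count: int) -> List[Tuple[int, int]]:
    """Divide [0, file_size) into node_count contiguous half-open byte ranges."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    base, extra = divmod(file_size, node_count)
    ranges = []
    start = 0
    for node in range(node_count):
        # The first `extra` nodes take one additional byte each,
        # so the ranges cover the whole file with no gaps or overlaps.
        length = base + (1 if node < extra else 0)
        ranges.append((start, start + length))
        start += length
    return ranges


for node, (begin, end) in enumerate(split_byte_ranges(1_000_000, 3)):
    print(f"node {node}: bytes {begin} to {end - 1}")
```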

_________________________________________________________________



Ranadeep Bhattacharya - 11:50 pm

Q: But is the file physically located on a single server or split between the 10 or 20?

Matt Casters - 11:50 pm

A: Located on a single shared filesystem.  So the same file is read by N nodes. 

_________________________________________________________________



abhishek manocha - 11:50 pm

Q: So as I understand it, is there a limitation that clustering is only possible if we choose CSV as our input step? It doesn't work on the Table Input step?

Matt Casters - 11:52 pm

A: You can do the same thing with a Table Input node, but you need to tweak the SQL statement that is executed since you only want a part of the rows.  Usually it involves using a MOD (%) operator and internal variables representing the node # and # of nodes.

_________________________________________________________________



Robert Folkerts - 11:51 pm

Q: Were there experiments with dimension lookups when populating a fact table?  That is my 'bread and butter' case.

Matt Casters - 11:53 pm

A: Not yet Robert.  With the new cache pre-load option it would make an interesting experiment for sure.

_________________________________________________________________



Dan Jolly - 11:54 pm

Q: When is 3.2 scheduled for GA?

Matt Casters - 11:55 pm

A: Dan, 3.2.0-stable was released last week.

_________________________________________________________________



Vijayaraghavan Amirisetty - 11:56 pm

Q: Does PDI plan to support other partitioning methods like key-range partitioning or hash partitioning in the future ? - Vijay

Matt Casters - 11:58 pm

A: It's not on our roadmap right away.  That being said, it's possible to do both now through Partitioning plugins as well as through a calculation. (simply calculate a partition # and do a mod part on that)
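
A rough Python sketch of the "calculate a partition # and do a mod on that" idea, outside of PDI. The function `partition_for` and the sample keys are made up for illustration, and CRC32 is just my choice of a stable hash, not something named in the webinar.

```python
import zlib


def partition_for(key: str, partition_count: int) -> int:
    """Map a key to a partition number in the range 0..partition_count-1."""
    # crc32 gives the same value on every run, unlike Python's built-in
    # hash() for strings, which is randomized per process.
    return zlib.crc32(key.encode("utf-8")) % partition_count


customers = ["alice", "bob", "carol", "dave", "erin"]  # hypothetical keys
partition_count = 4
assignment = {name: partition_for(name, partition_count) for name in customers}

for name, partition in assignment.items():
    print(f"{name} -> partition {partition}")
```

The same key always lands in the same partition, which is what lets each node work on its own slice of the data.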

_________________________________________________________________



Peter Schmidt - 11:55 pm

Q: Can you please re-explain the diff between 50/Sort and 100/Sort and 300/Sort.

Matt Casters - 11:56 pm

A: The only difference is the size of the lineitem.tbl file size.  300=1.8B rows, 100=600M rows, etc

_________________________________________________________________



abhishek manocha - 11:57 pm

Q: So considering a scenario where I have 80 small db inputs and I need to collate them in one central target db, with scheduling of once an hour (24 times a day) for all sources, does clustering make sense?

Matt Casters - 12:00 am

A: It can make sense if the CPU consumption on your one server is a bottleneck.  If that's not the case, you don't really need to do it.

_________________________________________________________________



sanjeev sagar - 11:59 pm

Q: I joined late, but which benchmark is it?

Lance Walter - 12:01 am

A: The whitepaper on bayon-technologies has more details. It uses TPC-H data, but is not a "benchmark" by design.

_________________________________________________________________



Dan Jolly - 12:01 am

Q: Is this cost model based on the EC2 costs?

Lance Walter - 12:01 am

A: yes, computing as well as storage costs on EC2

_________________________________________________________________



prem brahmandam - 11:54 pm

Q: Can we get a sample transform using "table input" step with tweaked query..

Matt Casters - 12:03 am

A: SELECT * FROM foo WHERE mod(id, ${Internal.Step.Unique.Count}) = ${Internal.Step.Unique.Number}
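
To see why this query gives each node its own share of the rows, here is a small Python sketch that mimics the WHERE clause. It is not PDI code: `rows_for_node` and the sample ids are hypothetical, and I am assuming that node numbers start at 0 so that they line up with the possible results of the mod.

```python
def rows_for_node(ids, node_count, node_number):
    """Keep only the ids this node should read: mod(id, node_count) == node_number."""
    return [i for i in ids if i % node_count == node_number]


ids = list(range(1, 11))  # stand-in for the id column of table foo
node_count = 3            # plays the role of ${Internal.Step.Unique.Count}

# node_number plays the role of ${Internal.Step.Unique.Number}
slices = [rows_for_node(ids, node_count, n) for n in range(node_count)]

for node_number, node_ids in enumerate(slices):
    print(f"node {node_number} reads ids {node_ids}")
```

Every id ends up in exactly one slice, so the nodes together read the whole table once.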

_________________________________________________________________



Peter Schmidt - 12:02 am

Q: If most of your ETL uses table input/table output steps, what changes need to be made to one's transformations, it sounds like if you were reading from flat files, you wouldn't have to do much to configure this to work?

Matt Casters - 12:05 am

A: Peter, it highly depends on the question if your source database can make use of multiple CPUs, etc.  The best strategy is to partition/shard the source and target databases as well. (see also prem's question above)

_________________________________________________________________



Peter Schmidt - 12:08 am

Q: Quick question on the sample transform query, so I am assuming you'd have to put a wrapper around this that increments the count and number)?

Matt Casters - 12:09 am

A: It goes without saying that those internal variables are set automatically in a clustered run.  

_________________________________________________________________



Bret Landon - 12:08 am

Q: Is any of this based on the hadoop methodology?

Matt Casters - 12:10 am

A: No Hadoop cluster is needed although we have plans to make use of Hadoop clusters in the near future.

_________________________________________________________________



Laura Moche - 12:10 am

Q: Were the EC2 servers from this test case dedicated to this test?  Or were the servers shared with other processing?  

Nicholas Goodman - 12:11 am

A: We dedicated the use for the transformations.  But EC2 instances aren't dedicated - they are shared with other EC2 users...

_________________________________________________________________



Dan Jolly - 11:57 pm

Q: What is that top level number

Nicholas Goodman - 12:12 am

A: sorted 450k rec/sec for 40 nodes

_________________________________________________________________



abhishek manocha - 12:08 am

Q: No Matt, building on Peter's question, what if the source databases are really on different machines and partitioning/sharding is not an option, as in the case of MySQL 4?

Nicholas Goodman - 12:13 am

A: There are things that can be done to partition the connection on the PDI side.  ie - if you know that host xyz keeps partition 1, and abc keeps partition 2 you can set that up and we'll use just plain 'ole JDBC

_________________________________________________________________



Venu Ambekar - 12:12 am

Q: Is there a capability to handle only incremental changes from a datasource, instead of depending upon the time-stamps of the tables in the datasource.

Nicholas Goodman - 12:14 am

A: Yes.  PDI has capabilities for detecting changes from data sources and can help you process only those changes.

_________________________________________________________________



Lakshman Bulusu - 12:14 am

Q: What about EL-T in the Cloud?

Matt Casters - 12:16 am

A: It highly depends on the situation.  It still depends on the capabilities of the database(s) (parallelism etc) used.  Suffice it to say that we always recommend you to make that call yourself in PDI.

_________________________________________________________________



abhishek manocha - 12:16 am

Q: Oh I see, partitioning on the PDI side itself, even if not supported by the underlying db?

Matt Casters - 12:17 am

A: Yes, we referred to data partitioning in the PDI streams earlier, not just database table partitioning.

_________________________________________________________________



Tony Sidhu - 12:13 am

Q: Is it practical to run processing in the cloud, that is inputting and outputting to a database that runs in the office?

Matt Casters - 12:21 am

A: Tony, barring any extreme case, I don't think so.  We have another partner that did something similar but keeping the data on the cloud because of cost savings (30%).  It has to be noted that the machine needed to be hosting reports, analyses, etc.  

_________________________________________________________________



Scott Sorensen - 12:21 am

Q: Could you elaborate on 'capabilities to detect changes in a data source' - or provide a reference on how this is done.

Nicholas Goodman - 12:23 am

A: Sure... a couple of questions on this.  There are a few different ways to approach this - none of which is any PDI silver bullet.  There's a step to compare rows (one stream is reference, other is changed) and output diffs.  You can simply parameterize your

Nicholas Goodman - 12:23 am

A: queries so that they only take "update_dt > {last_time_I_checked}"
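
The parameterized-query approach can be sketched in plain Python with an in-memory SQLite database. This is only an illustration and not how PDI does it: the table `orders`, the function `fetch_changes` and the sample rows are made up, and it assumes the source table carries a reliable `update_dt` column stored as ISO-style text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, update_dt TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [
        (1, "2009-06-01 10:00:00"),
        (2, "2009-06-03 09:30:00"),
        (3, "2009-06-04 08:15:00"),
    ],
)


def fetch_changes(conn, last_checked):
    """Return rows modified after last_checked, oldest first."""
    cur = conn.execute(
        "SELECT id, update_dt FROM orders WHERE update_dt > ? ORDER BY update_dt",
        (last_checked,),
    )
    return cur.fetchall()


last_checked = "2009-06-02 00:00:00"
changed = fetch_changes(conn, last_checked)
print("changed rows:", changed)

# Remember the newest timestamp seen, so the next run only picks up later changes.
if changed:
    last_checked = changed[-1][1]
```

Rows that are deleted in the source never show up this way, which is one reason this is no silver bullet.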

_________________________________________________________________



Kamal Trivedi - 12:16 am

Q: what if in future the cloud location is moved overseas

Matt Casters - 12:22 am

A: If your infrastructure cloud is moved then it all depends on data volumes, internet speed, etc whether or not you would get in trouble.  With the internet getting faster all the time, I doubt this will become an issue soon.

_________________________________________________________________



Steve McAtee - 12:18 am

Q: Will a recording of this presentation be available offline?



Matthew Papertsian - 12:23 am

A: Yes - the recording will be available within 48 hours and sent to you via email



_________________________________________________________________



abhishek manocha - 12:22 am

Q: What's the role of Carte in all this clustering? I was under the impression that it's internal to PDI; if I am going to EC2, where does it fit?

Matt Casters - 12:24 am

A: Carte is simply a small webserver that listens to the outside world.  It can be given transformations, jobs etc to execute. It's controlled remotely.  It's launched during startup of an EC2 host.

_________________________________________________________________



Dan Jolly - 12:20 am

Q: are there any similar case studies?

Nicholas Goodman - 12:24 am

A: Hi Dan - It'd be great if we could get customers to publish some of their own results and case studies for "big data."  I'm not aware of any case studies that look at CLUSTERING/PARTITIONING explicitly.

Lance Walter - 12:24 am

A: here's the best public example - here's the announcement http://www.pentaho.com/news/releases/20090210_nutricia_uses_pentaho_on_amazon_cloud.php  and here is the technical case study http://tinyurl.com/q9b9hs

Nicholas Goodman - 12:25 am

A: PS - How's the weather in Colorado today?

_________________________________________________________________



abhishek manocha - 12:16 am

Q: Can you give more info on that Nick please... on the incremental data?

Nicholas Goodman - 12:26 am

A: See below... there's nothing really "Big Data" special about change data capture.

Nicholas Goodman - 12:27 am

A: there's a variety of techniques.. check the wiki/pentaho training/dev lists for info on how to do this.

_________________________________________________________________



abhishek manocha - 11:58 pm

Q: It may be obvious question, but in last 10 days I have touched the surface of PDI only and done sample test on single workstation of mine

Matthew Papertsian - 12:27 am

A: Abhishek - can you please complete your question as I am not certain what you are asking?

_________________________________________________________________



sanjeev sagar - 11:59 pm

Q: or which tools were used for these fig.?

Nicholas Goodman - 12:28 am

A: This was PDI 3.2 (pre GA it was inbetween release candidates).  The exact build # is in the whitepaper.

_________________________________________________________________



abhishek manocha - 12:28 am

Q: Hi Matt, so you don't really recommend it for office use, as you mentioned to Tony?

Matt Casters - 12:30 am

A: Well, given the fact that you now have "local" clouds in large corporations and that virtualization keeps growing, I might be completely wrong.  Then again, it was a very specific question that Tony had.

_________________________________________________________________



abhishek manocha - 12:30 am

Q: Ok, fair enough Matt

Matt Casters - 12:31 am

A: Sure thing!

_________________________________________________________________



Ulrich Riedel - 12:30 am

Q: I have seen in the webcast that sorting 1 billion lines a month costs app. 32,000$. Why is this price said to be cheap? Are there comparable prices?

Matt Casters - 12:32 am

A: It's only 4$ to sort a billion rows.  I think the situation was that if you needed to do it every hour or so, you would spend that money.  Cloud works best economically if it takes care of peak loads.

Matt Casters - 12:32 am

A: Sorry 6$ :-)

Thursday, June 05, 2008

Payback (in km) when you go for a second hand bike

I decided to go for a second hand bike as I do not prefer to go on a bike in city traffic.
I intend to use the bike for long distance drives on weekends. As you see in the calculations below,
one recovers the cost of a new Pulsar in 24,500 km, while a second hand Pulsar's price is recovered in 16,000 km. The maintenance cost has been factored into the calculations (it is twice the amount that a regular bike would have).

btw, I bought this bike from Bike Galaxy at Basappa Circle on Lalbagh Road. There are a lot of second hand bike dealers (agents in fact) near Minerva Circle and VV Puram. The same bike would have cost me 3 to 4 thousand less if I had bought it from the seller directly. I had to pay 1000 Rs to the agent to buy this bike.
The agent gathers this bike from exchange melas, direct sellers, etc.
How accurate are my calculations?
Only time will tell :)

Monday, June 02, 2008

Bought a bike. Ready to go

The bike, a 2003 Pulsar (non DTSI, non alloy wheels, non kickstart) cost me 28,600 (this includes 750 Rs as insurance, 1000 Rs as commission, 450 Rs as registration charges). I bargained with the owner to reduce the price from 29,000 to 26,600.
The odometer reading has been manipulated, so I don't know how many km it has run. No idea of the mileage (trying to figure that out now), the RPM meter malfunctions and the idling RPM has been set to 4,000 (2,200 should be ideal).
Meter has started, and I am in the process of learning the bike internals at present. I intend to use this bike for long distance rides, and the target would be to make a trip to Kolhapur from Bangalore (600 km one way on NH4).

Actually, it is always better to buy a bike directly from the owner as you can save the commission. But you can do this only if you have the ability to gauge the condition of the bike and judge the price of the bike correctly. I feel that I paid a bit more for the bike (I had started with a budget of 25,000!), but since I have already bought it, it is okay.

Here are a few tips that I found useful:

Mouthshut

60kph

btw, I bought the bike from Minerva circle. You will find a lot of shops selling used bikes there. Do check for the Road tax papers and the RC book before buying the bike. It is always good to take along a friend who is knowledgeable about bikes.


Sunday, June 01, 2008

Yahoo Ad "Sense" screwed up


I come to office on a Monday morning, log on to Yahoo Mail and notice something
different. The banner ad was in Chinese! (Could be Japanese too). I am trying to understand the logic behind being presented this ad! There are no Chinese e-mails in my mailbox, no Chinese girlfriends. The only thing in my inbox that is vaguely related to China is the eBay correspondence about the purchase of a mobile phone (CECT make), which is manufactured in China.
Funny!

PS: I use Google AdSense on my blog, and most of the time, the text ads that are displayed are related to the context of the blog post.