IT Best Kept Secret Is Optimization (Comments)

Re: Why Python (JeanFrancoisPuget, 2016-02-04)

@mkeller
One thing that may help: Spark ML seems to be quite good at providing a Python API that matches the Scala API. For instance, PCA is available in both:
Scala:
http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.ml.feature.PCA
Therefore, I recommend using ML rather than MLlib.
Unfortunately, it does not solve everything. For instance, SVD is not (yet) available in Spark ML. It is only available, in Scala, in Spark MLlib.
Re: Why Python (JeanFrancoisPuget, 2016-02-03)

@mkeller
You are right that the Scala Spark API is ahead of the rest. This is an issue the Spark team must solve if they want wide adoption, IMHO.

Re: Why Python (mkeller, 2016-02-03)

Very interesting article, thanks for sharing your thoughts.
One question I had concerns Scala. You mention that you included it in your list of languages because of its links to Spark. One of the reasons I have been looking into Scala recently is that some of the advanced features of Spark (e.g., GraphX, SVD, PCA) are not (yet) accessible via the Python or R APIs, and that the non-Scala APIs will always lack some of the functionality that Spark provides via Scala. I get the impression that while Python is my go-to language for all things data science, having at least a basic understanding of Scala is definitely helpful when it comes to working with Spark. I'm wondering what your thoughts are on this.

Re: Why Python (JeanFrancoisPuget, 2016-02-03)

Thank you @vinomaster! It is the second time in a week I hear that quants in the financial sector are using Python.
You are right that industry adoption leads to further industry adoption. That's how a few languages became pervasive.

Re: Why Python (vinomaster, 2016-02-03)

JFP - Thanks for addressing this topic. You have made a good case rooted in the size of ecosystems and general adoption. I would add that commercial software vendors should also take industry adoption into consideration. For example, the use of Python amongst quants in the financial sector is pervasive.

Re: Prescriptive Analytics Modeling for Python (pjcpjcpjc, 2016-01-29)

FYI: a clean modeling of MIPs with pandas probably requires resolution of the issue below.
https://github.com/pydata/pandas/issues/10695
I'll let Irv go into the details if he wants. But my assessment is that either 10695 needs resolution, or you need the specialized indexing class Irv has developed, in order to implement the sort of slicing MIPs require in a readable manner.

Re: Prescriptive Analytics Modeling for Python (IrvL, 2016-01-27)

IMHO, the examples should all be using pandas as the way of maintaining the various tables one needs to handle as data input to an optimization problem. When building real optimization models with real data, you often have multiple tables of data (read from databases, spreadsheets, CSV files, or even HTML tables) that then have to be manipulated and sanitized for the model. pandas offers the ability to read those tables, do joins (really important!), do slicing, and you can even put vectors of decision variables as columns of a pandas DataFrame.
So I think you ought to write new examples using pandas, not numpy!
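As a small, hedged illustration of the workflow Irv describes (the table names and columns below are hypothetical, not taken from the post): read several tables, join them, and obtain one sanitized table ready to feed an optimization model.

```python
import pandas as pd

# Hypothetical input tables for a small production-planning model.
products = pd.DataFrame({
    "product": ["chair", "table"],
    "machine": ["saw", "saw"],
    "profit": [45.0, 80.0],
})
capacity = pd.DataFrame({
    "machine": ["saw", "drill"],
    "hours": [40.0, 30.0],
})

# The join step highlighted above: attach each product's machine capacity
# before handing the sanitized table to the optimization model.
model_data = products.merge(capacity, on="machine", how="left")
print(model_data[["product", "profit", "hours"]])
```

In real models the DataFrames would come from `read_sql`, `read_excel`, `read_csv`, or `read_html`, and a column of decision variables could be added to `model_data` in the same way.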
Re: A Speed Comparison Of C, Julia, Python, Numba, and Cython on LU Factorization (JeanFrancoisPuget, 2016-01-27)

Hi Irv,
You're right, a sparse matrix example would probably be more interesting. I'll think about it.

Re: Prescriptive Analytics Modeling for Python (JeanFrancoisPuget, 2016-01-27)

You are right, my examples were written with a beta version that did not support NumPy. Many of the ones on GitHub were also written before NumPy support.
The point about NumPy is to be able to use pandas DataFrame columns or pandas Series easily when setting constraints or the objective. I'll write new ones that leverage NumPy when it makes sense.

Re: Prescriptive Analytics Modeling for Python (pjcpjcpjc, 2016-01-26)

"Numpy is the standard way of representing data for data scientists"
But your examples appear to store data in standard Python (dicts or lists).
Not a criticism per se. I find standard Python data structures to be the most readable way to store MIP data as well. This is, of course, something of a matter of taste. But it would be fun to see these examples using pandas or NumPy.

Re: A Speed Comparison Of C, Julia, Python, Numba, and Cython on LU Factorization (IrvL, 2016-01-19)

Because of the cache issues and how well LAPACK is tuned to cache, I'm not sure how valuable this comparison is. For the work that you and I do in optimization, what would be more interesting is a comparison of a sparse LU factorization. Then you get to see what kinds of tradeoffs exist with respect to the differences in integer and floating point computation, as well as what kind of gain happens because you can do pointer-based data structures in C. I'm hoping you'll find some way to do that comparison!

Re: A Speed Comparison Of C, Julia, Python, Numba, and Cython on LU Factorization (JeanFrancoisPuget, 2016-01-16)

The code is in the archive; the link is at the end of the post.
The only modification I made since I archived it is the @inbounds decorator.

Re: A Speed Comparison Of C, Julia, Python, Numba, and Cython on LU Factorization (philtor, 2016-01-16)

Ah, I see you have a run_julia function there, so to do the setup and timing it must be something like:
    function setup(N)
        y = zeros(N)
        A = zeros(N, N)
        B = zeros(N, N)
        @time run_julia(y, A, B, N)
    end
Re: A Speed Comparison Of C, Julia, Python, Numba, and Cython on LU Factorization (philtor, 2016-01-16)

Could you include your code for running and timing the Julia version of your det_by_lu function?

Re: A Speed Comparison Of C, Julia, Python, Numba, and Cython on LU Factorization (JeanFrancoisPuget, 2016-01-16)

SIMD can add a factor-2 or more speedup, hence memory bandwidth is not that limiting. But you are right, a better implementation would need to be cache friendly, and that is what LAPACK provides.

Re: A Speed Comparison Of C, Julia, Python, Numba, and Cython on LU Factorization (InsideLoop, 2016-01-16)

Hi Jean-Francois,
Before going to SIMD and multithreading, the most important point is to make the algorithm cache-friendly. I have no idea how difficult this is for an LU decomposition; you might look at the LAPACK source code. If you don't do that, SIMD and multithreading might be useless, as memory access would still be your bottleneck.
Also, it would be interesting to find out what the SciPy function behind "lu" is. I am not a Python programmer so I can't be sure, but there is a Fortran file in SciPy: https://github.com/scipy/scipy/blob/v0.15.1/scipy/linalg/src/lu.f . I believe this is the one that is called from lu.
Re: A Speed Comparison Of C, Julia, Python, Numba, and Cython on LU Factorization (JeanFrancoisPuget, 2016-01-16)

You are right, my mistake: the lu_factor function is a wrapper for the *GETRF routines from LAPACK.
You are also right that
1. This algorithm is naive, and it can be wrong numerically if the pivot is close to 0.
2. I have not used SIMD or parallelism yet. That is next.
3. This is not a language performance comparison, just a single data point.

Re: A Speed Comparison Of C, Julia, Python, Numba, and Cython on LU Factorization (InsideLoop, 2016-01-16)

Hi Jean-Francois,
Thanks for your article. I hope you won't mind if I play the devil's advocate here.
So there is no misunderstanding: BLAS does not provide any LU factorization algorithm. LAPACK is the library that is called when you call lu_factor from SciPy.
LU factorization algorithms are tricky to get right, and we might classify them depending upon:
1) No pivoting // partial pivoting // full pivoting
The algorithm presented here does not do any pivoting, which is extremely dangerous in practice, as L and U could have condition numbers way larger than A's, which could make the factorization useless. Partial pivoting usually keeps L and U with a condition number "not too big" compared to A. Full pivoting keeps the condition numbers of L and U even lower. No pivoting is faster than partial pivoting, which is faster than full pivoting. Usually, people use partial pivoting as it is a good tradeoff between speed and accuracy. This is the default strategy used in LAPACK.
2) Not optimized for cache // optimized for cache
When the size of the matrix does not fit into the cache, the algorithm presented here becomes limited by the memory bandwidth: it is not optimized for the cache of your CPU. For n = 3000, a nxn matrix of doubles needs about 70 MB, far more than the cache of our computers.
As you can see on your last graph, your LAPACK implementation used by scipy through the lu_factor function outperforms the other ones as soon as your matrix does not fit into your L1 cache anymore (for n = 65, a nxn matrix of doubles needs 32 kB of memory which is most likely the size of your L1 cache).
3) Single threaded // Multithreaded
The algorithm might use one or more threads.
The comparison between pure Python, C (Fortran ordered), Numba, and Cython seems fair to me, as all of them use no pivoting, no cache optimization, and a single thread. It shows that Numba and Cython can reach the speed of C here. But bear in mind that the implementation is not optimized for cache, so it "slows down" every language.
What would be interesting to know is:
- Can Numba and Cython handle the comparison with C if the implementation is optimized for cache?
- Can Numba and Cython generate multithreaded code easily?
I am quite sure that we can't answer yes to both of these questions.
In a nutshell, I think that this benchmark should not be used to compare "languages". The reason is that the algorithm is implemented in a way in which the speed is limited by the memory bandwidth of your computer, not the "speed" of the language. That's why I find the title misleading.
But it shows that an LU factorization is tricky to implement. You should use well-implemented ones, in terms of both accuracy and speed. And you can access those implementations from any language.
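To make the pivoting distinction above concrete, here is an educational pure-Python sketch of LU factorization with partial pivoting (the function name is mine; real code should call LAPACK's *GETRF routines, e.g. through scipy.linalg.lu_factor):

```python
def lu_partial_pivot(A):
    # Educational sketch: LU factorization with partial pivoting.
    # Returns a row permutation P and factors L, U such that row i of
    # L.U equals row P[i] of A.
    n = len(A)
    U = [row[:] for row in A]        # working copy, becomes upper triangular
    L = [[0.0] * n for _ in range(n)]
    P = list(range(n))
    for k in range(n):
        # Partial pivoting: bring the row with the largest |entry| in
        # column k to the pivot position, so multipliers stay <= 1.
        p = max(range(k, n), key=lambda i: abs(U[i][k]))
        U[k], U[p] = U[p], U[k]
        L[k], L[p] = L[p], L[k]
        P[k], P[p] = P[p], P[k]
        L[k][k] = 1.0
        for i in range(k + 1, n):
            m = U[i][k] / U[k][k]
            L[i][k] = m
            for j in range(k, n):
                U[i][j] -= m * U[k][j]
    return P, L, U

A = [[2.0, 1.0, 1.0], [4.0, 3.0, 3.0], [8.0, 7.0, 9.0]]
P, L, U = lu_partial_pivot(A)
# Check the factorization: row i of L.U equals row P[i] of A.
for i in range(3):
    for j in range(3):
        assert abs(sum(L[i][k] * U[k][j] for k in range(3)) - A[P[i]][j]) < 1e-12
```

Without the pivot search (taking p = k), the same elimination breaks down whenever U[k][k] is zero or tiny, which is exactly the numerical danger of the no-pivoting variant described above.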
Re: A Speed Comparison Of C, Julia, Python, Numba, and Cython on LU Factorization (JeanFrancoisPuget, 2016-01-16)

Thank you, I had seen this before but never actually used it.
You made me switch!

Re: A Speed Comparison Of C, Julia, Python, Numba, and Cython on LU Factorization (cjrh, 2016-01-15)

Great article! I'm sure the author already knows this, but for the benefit of other readers: the Cython code can be modernised somewhat by replacing the explicit NumPy types with memory views, i.e. replace
    np.ndarray[double, ndim=1] y

with:

    double[:] y
This will also allow other data structures that support the buffer protocol to be passed to the routine. In fact, with this Cython code you don't need any NumPy imports at all, and they can be removed from your Cython code once you switch over to memory views.

Re: How To Make Python Run As Fast As Julia (JeanFrancoisPuget, 2015-12-20)

Hi Vitaly, thank you, I am glad you find this useful.

Re: How To Make Python Run As Fast As Julia (VitaliyDre, 2015-12-19)

Thank you @JeanFrancoisPuget. This article was very useful for me as a Python developer; I will use some of these tricks in my work.

Re: Python Is Not C: Take Two (JeanFrancoisPuget, 2015-12-18)

Interesting.
The tree implementation should indeed scale better, as it is O(log(N)) where brute force is O(N). But finding the nearest neighbor isn't as simple as going down one branch of that tree. If the input point is very close to one of the tree boundaries, then you have to go down both sides of the boundary.
I haven't done a formal analysis, but I'd say you need to perform 2^d dives in the tree in the worst case, where d is the number of axes.
A simple case is a 2D tree with four leaves evenly spaced. The tree splits the overall space into four regions: top left, top right, bottom left, and bottom right. If the input point is exactly at the center, then all four regions must be checked.
Is that what you had to implement?
Regarding the timing, the 30 ms is to find the nearest neighbor for 400 way points. The average per point is therefore about 75 microseconds. I'll make that clearer in the post.

Re: Python Is Not C: Take Two (jheriko, 2015-12-17)

I've had to solve this problem before in a slightly different context, but with /millions/ of data points... I did it in C++, but I took nearly the same approach as the one here, although I skipped the naive and vectorised approaches and went straight for spatial partitioning.
I wrote another program to transform the point data into a kd-tree. A kd-tree is usually perfect for points because you can simply sort them along an axis, use the average along that axis to decide a split point, and then recurse by working on each half of the data with a different axis, continually, until you end up with leaf nodes containing only one point each, and thereby build as close to a perfectly balanced tree as possible. If you understand this, you should realise that by taking the tree down to its limit like that, you can discard the Manhattan distance check altogether, since it is embedded in the tree (the planes of the kd-tree describe a Voronoi diagram where Manhattan distance is used, not Euclidean).
Once you have the data laid out in this way, it should take only log n comparisons (in this case 16) to find the answer. If this is taking milliseconds then there is something else wrong, IMO... 16 branches is not a millisecond's worth of work, even in Python. I would assume that the kd-tree algorithms from the libraries are doing that... but if it's taking that much more than a millisecond to run, I'd be skeptical that you couldn't do better by rolling your own with the most dumbass and simple code possible, under the assumption that some hidden and needless complexity (perhaps because the libraries are meant to provide super-generic solutions, e.g. metric='minkowski', needing to specify the norm, etc.) is causing the slowdown.
Unwinding longitude and latitude into x/y/z might help too...
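The build-and-search scheme described above can be sketched in a few lines of pure Python. This is an illustrative sketch only (the names and dict-based node layout are mine, and it splits at the median using Euclidean distance rather than the Manhattan variant described), including the "check the far side of the splitting plane" step from the earlier reply:

```python
import math

def build_kdtree(points, depth=0):
    # Recursively split on alternating axes at the median point.
    if not points:
        return None
    axis = depth % len(points[0])
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return {
        "point": points[mid],
        "axis": axis,
        "left": build_kdtree(points[:mid], depth + 1),
        "right": build_kdtree(points[mid + 1:], depth + 1),
    }

def nearest(node, target, best=None):
    # Descend the near side first, then check whether the far side of the
    # splitting plane could still hold a closer point (the "both sides" case).
    if node is None:
        return best
    d = math.dist(node["point"], target)
    if best is None or d < best[0]:
        best = (d, node["point"])
    axis = node["axis"]
    delta = target[axis] - node["point"][axis]
    near, far = (node["left"], node["right"]) if delta < 0 else (node["right"], node["left"])
    best = nearest(near, target, best)
    if abs(delta) < best[0]:          # splitting plane closer than current best
        best = nearest(far, target, best)
    return best

points = [(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)]
tree = build_kdtree(points)
print(nearest(tree, (9, 2)))   # (distance, nearest point) for the query (9, 2)
```

For the 400-way-point case in the post, a library kd-tree (e.g. scipy.spatial.cKDTree) would normally be preferable to rolling your own.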
Then again... 30 ms is fast enough for a lot of applications. Even if it sounds like an entire frame interval to me... ;)

Re: Installing Cython For Anaconda On Windows (JeanFrancoisPuget, 2015-12-17)

@flutefreak7: Thanks. I did install https://www.microsoft.com/en-us/download/details.aspx?id=44266 to make my system Python install work fine. But it wasn't enough for Anaconda to work.
I'll update my post with the information you provide, especially the version numbering.
Note that my override of majorVersion is correct. The original code did subtract 6 to accommodate the offset between the version returned by the Python version info and the actual Visual Studio major version.

Re: Installing Cython For Anaconda On Windows (flutefreak7, 2015-12-17)

The MSC v.1500 is actually a reference to the Microsoft Visual C++ compiler which goes with Visual Studio major version 15 (aka "Visual Studio 2008"), which corresponds to MSVC++ 9.0. Here's a quick reference: http://stackoverflow.com/a/2676904.

Re: Python Is Not C: Take Two (JeanFrancoisPuget, 2015-12-17)

No good reason actually. I started with John's function, which was using math only. I replaced math with numpy where I replaced the scalar argument by an array. I could have replaced all the math calls as you suggest.

Re: Python Is Not C: Take Two (BBBS, 2015-12-17)

Why did you use both math and numpy?
    cos = (math.sin(phi1)*np.sin(phi2)*np.cos(theta1 - theta2) +
           math.cos(phi1)*np.cos(phi2))

Why not:

    cos = (np.sin(phi1)*np.sin(phi2)*np.cos(theta1 - theta2) +
           np.cos(phi1)*np.cos(phi2))
Re: How To Make Python Run As Fast As Julia (DNF, 2015-12-16)

But I don't really agree it's an implementer point of view. I am a user, not an 'implementer', and my thinking goes as follows:
I will not find a benchmark that actually implements the exact problem I'm trying to solve (except very rarely). Instead, *as a user*, I would like to see what the efficiency is for the various strategies I might use when implementing my problem. Knowing how good a language is at looping vs. vectorized performance *in general* is interesting to me as a user.

Re: How To Make Python Run As Fast As Julia (JeanFrancoisPuget, 2015-12-16)

There is indeed a disagreement about the purposes of the benchmarks. I see at least two purposes at stake here.
1. A user point of view, which is to see how to best accomplish things in a given language. It is the result of various tradeoffs, including this one: balancing the time and effort to code something against the efficiency you get. That's the view of most Python users reacting to my post. We don't mind using Python libraries, even if they aren't written in 'pure' Python. Actually, the massive set of existing Python libraries is probably one key reason for its success.
2. A language implementer point of view, which focuses on how elementary language operations perform. That's the purpose of the Julia micro benchmarks, I think.
If people do not agree on the yardstick they use, then the discussion is not going to be fruitful. This disagreement explains most of the comments I have seen so far.
I am using the 'user point of view' in my post, but I do have issues with the Julia micro benchmarks even from the 'language implementer' point of view. 64-bit vs. arbitrary precision is one. The other is the use of libraries written in foreign languages. Where do we stop? The Julia team did use NumPy in some benchmarks. Why not use it everywhere? Julialang calls Fortran libraries like LAPACK in these micro benchmarks. It would be cleaner to only use 'pure' Julia everywhere. At least, it would be clear that they use the 'language implementer' point of view.