[python] C, Clojure, Python, Ruby, Scala 등의 벤치 마크 해석 [닫기]

Question 1

부인 성명

나는 인공적인 벤치 마크가 악하다는 것을 알고 있습니다. 매우 구체적인 좁은 상황에 대해서만 결과를 표시 할 수 있습니다. 나는 어리석은 벤치 때문에 한 언어가 다른 언어보다 낫다고 생각하지 않습니다. 그러나 결과가 왜 그렇게 다른지 궁금합니다. 하단의 내 질문을 참조하십시오.

수학 벤치 마크 설명

벤치 마크는 6만큼 다른 소수 쌍을 찾기위한 간단한 수학 계산입니다 ( 섹시 소수 라고 함 ) 예를 들어 100 미만의 섹시 소수는 다음과 같습니다.(5 11) (7 13) (11 17) (13 19) (17 23) (23 29) (31 37) (37 43) (41 47) (47 53) (53 59) (61 67) (67 73) (73 79) (83 89) (97 103)

결과 표

표에서 : 계산 시간 (초
)

  Sexy primes up to:        10k      20k      30k      100k

  Bash                    58.00   200.00     [*1]      [*1]

  C                        0.20     0.65     1.42     15.00

  Clojure1.4               4.12     8.32    16.00    137.93

  Clojure1.4 (optimized)   0.95     1.82     2.30     16.00

  Factor                    n/a      n/a    15.00    180.00

  Python2.7                1.49     5.20    11.00       119

  Ruby1.8                  5.10    18.32    40.48    377.00

  Ruby1.9.3                1.36     5.73    10.48    106.00

  Scala2.9.2               0.93     1.41     2.73     20.84

  Scala2.9.2 (optimized)   0.32     0.79     1.46     12.01

[* 1]-시간이 얼마나 걸릴지 상상이됩니다

코드 목록

씨:

int isprime(int x) {
  int i;
  for (i = 2; i < x; ++i)
    if (x%i == 0) return 0;
  return 1;
}

void findprimes(int m) {
  int i;
  for ( i = 11; i < m; ++i)
    if (isprime(i) && isprime(i-6))
      printf("%d %d\n", i-6, i);
}

main() {
    findprimes(10*1000);
}

루비:

def is_prime?(n)
  (2...n).all?{|m| n%m != 0 }
end

def sexy_primes(x)
  (9..x).map do |i|
    [i-6, i]
  end.select do |j|
    j.all?{|j| is_prime? j}
  end
end

a = Time.now
p sexy_primes(10*1000)
b = Time.now
puts "#{(b-a)*1000} mils"

스칼라 :

def isPrime(n: Int) =
  (2 until n) forall { n % _ != 0 }

def sexyPrimes(n: Int) =
  (11 to n) map { i => List(i-6, i) } filter { _ forall(isPrime(_)) }

val a = System.currentTimeMillis()
println(sexyPrimes(100*1000))
val b = System.currentTimeMillis()
println((b-a).toString + " mils")

Scala 최적화 isPrime(Clojure 최적화와 같은 아이디어) :

import scala.annotation.tailrec

@tailrec // Not required, but will warn if optimization doesn't work
def isPrime(n: Int, i: Int = 2): Boolean =
  if (i == n) true
  else if (n % i != 0) isPrime(n, i + 1)
  else false

클로저 :

(defn is-prime? [n]
  (every? #(> (mod n %) 0)
    (range 2 n)))

(defn sexy-primes [m]
  (for [x (range 11 (inc m))
        :let [z (list (- x 6) x)]
        :when (every? #(is-prime? %) z)]
      z))

(let [a (System/currentTimeMillis)]
  (println (sexy-primes (* 10 1000)))
  (let [b (System/currentTimeMillis)]
    (println (- b a) "mils")))

Clojure 최적화 is-prime?:

(defn ^:static is-prime? [^long n]
  (loop [i (long 2)]
    (if (= (rem n i) 0)
      false
      (if (>= (inc i) n) true (recur (inc i))))))

파이썬

import time as time_

def is_prime(n):
  return all((n%j > 0) for j in xrange(2, n))

def primes_below(x):
  return [[j-6, j] for j in xrange(9, x+1) if is_prime(j) and is_prime(j-6)]

a = int(round(time_.time() * 1000))
print(primes_below(10*1000))
b = int(round(time_.time() * 1000))
print(str((b-a)) + " mils")

인자

MEMO:: prime? ( n -- ? )
n 1 - 2 [a,b] [ n swap mod 0 > ] all? ;

MEMO: sexyprimes ( n n -- r r )
[a,b] [ prime? ] filter [ 6 + ] map [ prime? ] filter dup [ 6 - ] map ;

5 10 1000 * sexyprimes . .

Bash (zsh) :

#!/usr/bin/zsh
function prime {
  for (( i = 2; i < $1; i++ )); do
    if [[ $[$1%i] == 0 ]]; then
      echo 1
      exit
    fi
  done
  echo 0
}

function sexy-primes {
  for (( i = 9; i <= $1; i++ )); do
    j=$[i-6]
    if [[ $(prime $i) == 0 && $(prime $j) == 0 ]]; then
      echo $j $i
    fi
  done
}

sexy-primes 10000

질문

Scala가 왜 그렇게 빠른가요? 정적 타이핑 때문 입니까? 아니면 JVM을 매우 효율적으로 사용하고 있습니까?
~~Ruby와 Python의 차이점은 무엇일까요? 나는이 둘이 완전히 다르지 않다고 생각했다. 내 코드가 잘못되었을 수 있습니다. 저를 깨달으십시오! 감사.~~ UPD 예, 내 코드에 오류가 있습니다. Python과 Ruby 1.9는 거의 동일합니다.
Ruby 버전 간의 생산성이 정말 인상적입니다.
유형 선언을 추가하여 Clojure 코드를 최적화 할 수 있습니까? 도움이 될까요?

Question 2

대략적인 답변 :

Scala의 정적 타이핑은 여기에서 상당히 도움이됩니다. 이것은 너무 많은 추가 노력없이 JVM을 매우 효율적으로 사용한다는 것을 의미합니다.
Ruby / Python 차이에 대해 정확히 확신 할 수는 없지만 (2...n).all?함수 is-prime?가 Ruby에서 매우 잘 최적화 될 가능성이 있다고 생각합니다 (편집 : 실제로 이것이 사실 인 것처럼 들립니다. 자세한 내용은 Julian의 답변을 참조하십시오 …)
Ruby 1.9.3은 훨씬 더 최적화되었습니다.
Clojure 코드는 확실히 많이 가속화 될 수 있습니다! Clojure는 기본적으로 동적이지만 필요한 경우 많은 경우에 유형 힌트, 기본 수학 등을 사용하여 Scala / 순수 Java 속도에 근접 할 수 있습니다.

Clojure 코드에서 가장 중요한 최적화는 is-prime?다음과 같이 형식화 된 기본 수학을 사용하는 것입니다.

(set! *unchecked-math* true) ;; at top of file to avoid using BigIntegers

(defn ^:static is-prime? [^long n]
  (loop [i (long 2)]
    (if (zero? (mod n i))
      false
      (if (>= (inc i) n) true (recur (inc i))))))

이 개선으로 Clojure가 0.635 초 만에 10k를 완료하게되었습니다 (즉, 목록에서 두 번째로 빠른 스칼라를 이겼습니다).

추신 : 어떤 경우에는 벤치 마크 내부에 코드를 인쇄하고 있다는 점에 유의하십시오. 특히 print처음 과 같은 기능을 사용하면 IO 하위 시스템이 초기화 되는 경우 등 결과가 왜곡되므로 좋은 생각이 아닙니다 !

Question 3

다음은 동일한 기본 알고리즘을 사용하는 빠른 Clojure 버전입니다.

(set! *unchecked-math* true)

(defn is-prime? [^long n]
  (loop [i 2]
    (if (zero? (unchecked-remainder-int n i))
      false
      (if (>= (inc i) n)
        true
        (recur (inc i))))))

(defn sexy-primes [m]
  (for [x (range 11 (inc m))
        :when (and (is-prime? x) (is-prime? (- x 6)))]
    [(- x 6) x]))

내 컴퓨터에서 원본보다 약 20 배 빠르게 실행됩니다. 다음은 1.5의 새로운 감속기 라이브러리를 활용하는 버전입니다 (Java 7 또는 JSR 166 필요).

(require '[clojure.core.reducers :as r]) ;'

(defn sexy-primes [m]
  (->> (vec (range 11 (inc m)))
       (r/filter #(and (is-prime? %) (is-prime? (- % 6))))
       (r/map #(list (- % 6) %))
       (r/fold (fn ([] []) ([a b] (into a b))) conj)))

이것은 원본보다 약 40 배 빠르게 실행됩니다. 내 컴퓨터에서는 1.5 초에 10 만입니다.

Question 4

나는 단지 # 2에 답할 것이다. 왜냐하면 내가 원격으로 말할 수있는 유일한 것이기 때문이다. 그러나 파이썬 코드의 경우에서 중간 목록을 만들고있는 is_prime반면, Ruby에서는 사용 .map하고있다. all반복.

다음으로 변경하는 경우 is_prime:

def is_prime(n):
    return all((n%j > 0) for j in range(2, n))

그들은 동등합니다.

파이썬을 더 최적화 할 수는 있지만, 제 루비는 제가 언제 더 많은 이점을 제공했는지 알기에 충분하지 않습니다 (예 : xrange 하면 Python이 내 컴퓨터에서 승리하게되지만 사용했던 Ruby 범위가 다음을 생성하는지 기억하지 못합니다. 메모리의 전체 범위).

편집 : 너무 어리석지 않고 Python 코드를 다음과 같이 만듭니다.

import time

def is_prime(n):
    return all(n % j for j in xrange(2, n))

def primes_below(x):
    return [(j-6, j) for j in xrange(9, x + 1) if is_prime(j) and is_prime(j-6)]

a = int(round(time.time() * 1000))
print(primes_below(10*1000))
b = int(round(time.time() * 1000))
print(str((b-a)) + " mils")

더 이상 변하지 않는데, 나를 위해 1.5 초로 설정하고, 매우 어리석게도 PyPy로 실행하면 10K의 경우 .3s, 100K의 경우 21 초가됩니다.

Question 5

isPrime방법을 다음과 같이 수정하여 Scala를 훨씬 빠르게 만들 수 있습니다.

  def isPrime(n: Int, i: Int = 2): Boolean =
    if (i == n) true
    else if (n % i != 0) isPrime(n, i + 1)
    else false

간결하지는 않지만 프로그램은 40 %의 시간에 실행됩니다!

우리는 불필요한 것을 잘라냅니다. Range 익명 Function객체를 Scala 컴파일러는 tail-recursion을 인식하고이를 while-loop로 변환합니다. JVM이 어느 정도 최적의 기계 코드로 바뀔 수 있으므로 C에서 너무 멀리 떨어져서는 안됩니다. 버전.

참고 항목 : Scala에서 for-comprehensions 및 루프를 최적화하는 방법은 무엇입니까?

Question 6

재미를 위해 병렬 및 비 병렬의 스칼라 버전이 있습니다. (듀얼 코어 컴퓨팅에서 병렬 버전은 335ms, 병렬 버전은 655ms가 걸립니다)

object SexyPrimes {
  def isPrime(n: Int): Boolean =
    (2 to math.sqrt(n).toInt).forall{ n%_ != 0 }

  def isSexyPrime(n: Int): Boolean = isPrime(n) && isPrime(n-6)

  def findPrimesPar(n: Int) {
    for(k <- (11 to n).par)
      if(isSexyPrime(k)) printf("%d %d\n",k-6,k)
  }

  def findPrimes(n: Int) {
    for(k <- 11 to n)
      if(isSexyPrime(k)) printf("%d %d\n",k-6,k)
  }


  def timeOf(call : =>Unit) {
    val start = System.currentTimeMillis
    call
    val end = System.currentTimeMillis
    println((end-start)+" mils")
  }

  def main(args: Array[String]) {
    timeOf(findPrimes(100*1000))
    println("------------------------")
    timeOf(findPrimesPar(100*1000))
  }
}

편집 : Emil H 의 제안에 따라 IO 및 jvm 워밍업의 영향을 피하기 위해 코드를 변경했습니다.

결과는 내 계산에 표시됩니다.

목록 (3432, 1934, 3261, 1716, 3229, 1654, 3214, 1700)

object SexyPrimes {
  def isPrime(n: Int): Boolean =
    (2 to math.sqrt(n).toInt).forall{ n%_ != 0 }

  def isSexyPrime(n: Int): Boolean = isPrime(n) && isPrime(n-6)

  def findPrimesPar(n: Int) {
    for(k <- (11 to n).par)
      if(isSexyPrime(k)) ()//printf("%d %d\n",k-6,k)
  }

  def findPrimes(n: Int) {
    for(k <- 11 to n)
      if(isSexyPrime(k)) ()//printf("%d %d\n",k-6,k)
  }


  def timeOf(call : =>Unit): Long = {
    val start = System.currentTimeMillis
    call
    val end = System.currentTimeMillis
    end - start
  }

  def main(args: Array[String]) {
    val xs = timeOf(findPrimes(1000*1000))::timeOf(findPrimesPar(1000*1000))::
             timeOf(findPrimes(1000*1000))::timeOf(findPrimesPar(1000*1000))::
             timeOf(findPrimes(1000*1000))::timeOf(findPrimesPar(1000*1000))::
             timeOf(findPrimes(1000*1000))::timeOf(findPrimesPar(1000*1000))::Nil
    println(xs)
  }
}

Question 7

벤치 마크는 신경 쓰지 마십시오. 이 문제에 관심이 생겼고 빠르게 조정했습니다. 이것은 lru_cache함수를 메모하는 데코레이터를 사용합니다 . 전화를 걸면 is_prime(i-6)기본적으로 프라임 수표를 무료로받습니다. 이 변경은 작업을 대략 절반으로 줄입니다. 또한, 우리는range() 홀수로만 전화를 걸 작업을 다시 대략 절반으로 줄일 수 있습니다.

http://en.wikipedia.org/wiki/Memoization

http://docs.python.org/dev/library/functools.html

이를 위해서는 Python 3.2 이상이 필요합니다. lru_cache 하지만 .NET Framework를 제공하는 Python 레시피를 설치하면 이전 Python과 함께 작동 할 수 있습니다 lru_cache. Python 2.x를 사용 xrange()하는 경우 실제로 대신 사용해야 합니다.range() .

http://code.activestate.com/recipes/577479-simple-caching-decorator/

from functools import lru_cache
import time as time_

@lru_cache()
def is_prime(n):
    return n%2 and all(n%i for i in range(3, n, 2))

def primes_below(x):
    return [(i-6, i) for i in range(9, x+1, 2) if is_prime(i) and is_prime(i-6)]

correct100 = [(5, 11), (7, 13), (11, 17), (13, 19), (17, 23), (23, 29),
        (31, 37), (37, 43), (41, 47), (47, 53), (53, 59), (61, 67), (67, 73),
        (73, 79), (83, 89)]
assert(primes_below(100) == correct100)

a = time_.time()
print(primes_below(30*1000))
b = time_.time()

elapsed = b - a
print("{} msec".format(round(elapsed * 1000)))

위의 내용은 편집하는 데 매우 짧은 시간이 걸렸습니다. 나는 한 단계 더 나아가 소수 테스트가 소수만 시도하고 테스트되는 숫자의 제곱근까지만 시도하도록 결정했습니다. 내가 한 방식은 순서대로 숫자를 확인하는 경우에만 작동하므로 진행되는 동안 모든 소수를 누적 할 수 있습니다. 하지만이 문제는 이미 순서대로 숫자를 확인하고 있었으므로 괜찮 았습니다.

내 노트북 (특별한 것은 없습니다. 프로세서는 1.5GHz AMD Turion II “K625″임)에서이 버전은 8 초 이내에 100K에 대한 응답을 생성했습니다.

from functools import lru_cache
import math
import time as time_

known_primes = set([2, 3, 5, 7])

@lru_cache(maxsize=128)
def is_prime(n):
    last = math.ceil(math.sqrt(n))
    flag = n%2 and all(n%x for x in known_primes if x <= last)
    if flag:
        known_primes.add(n)
    return flag

def primes_below(x):
    return [(i-6, i) for i in range(9, x+1, 2) if is_prime(i) and is_prime(i-6)]

correct100 = [(5, 11), (7, 13), (11, 17), (13, 19), (17, 23), (23, 29),
        (31, 37), (37, 43), (41, 47), (47, 53), (53, 59), (61, 67), (67, 73),
        (73, 79), (83, 89)]
assert(primes_below(100) == correct100)

a = time_.time()
print(primes_below(100*1000))
b = time_.time()

elapsed = b - a
print("{} msec".format(round(elapsed * 1000)))

위의 코드는 Python, Ruby 등으로 작성하기가 매우 쉽지만 C에서는 더 고통 스러울 것입니다.

유사한 트릭을 사용하도록 다른 버전을 다시 작성하지 않고는이 버전의 숫자를 다른 버전의 숫자와 비교할 수 없습니다. 나는 여기서 아무것도 증명하려고하지 않습니다. 나는 문제가 재미 있다고 생각했고 어떤 종류의 쉬운 성능 향상을 얻을 수 있는지보고 싶었습니다.

Question 8

포트란을 잊지 마세요! (대부분 농담이지만 C와 비슷한 성능을 기대합니다). 느낌표가있는 문장은 선택 사항이지만 좋은 스타일입니다. ( !포트란 90의 주석 문자)

logical function isprime(n)
IMPLICIT NONE !
integer :: n,i
do i=2,n
   if(mod(n,i).eq.0)) return .false.
enddo
return .true.
end

subroutine findprimes(m)
IMPLICIT NONE !
integer :: m,i
logical, external :: isprime

do i=11,m
   if(isprime(i) .and. isprime(i-6))then
      write(*,*) i-6,i
   endif
enddo
end

program main
findprimes(10*1000)
end