arXiv:math/0508185v1 [math.NT] 10 Aug 2005

PRIMES IN TUPLES I
D. A. GOLDSTON, J. PINTZ, AND C. Y. YILDIRIM
Abstract. We introduce a method for showing that there exist prime numbers which are very close together. The method depends on the level of distribution of primes in arithmetic progressions. Assuming the Elliott-Halberstam
conjecture, we prove that there are infinitely often primes differing by 16 or
less. Even a much weaker conjecture implies that there are infinitely often
primes a bounded distance apart. Unconditionally, we prove that there exist
consecutive primes which are closer than any arbitrarily small multiple of the
average spacing, that is,
pn+1 − pn
= 0.
lim inf
n→∞
log pn
This last result will be considerably improved in a later paper.

1. Introduction
One of the most important unsolved problems in number theory is to establish
the existence of infinitely many prime tuples. Not only is this problem believed to
be difficult, but it has also earned the reputation among most mathematicians in
the field as hopeless in the sense that there is no known unconditional approach for
tackling the problem. The purpose of this paper, the first in a series, is to provide
what we believe is a method which could lead to a partial solution for this problem.
At present, our results on primes in tuples are conditional on information about
the distribution of primes in arithmetic progressions. However, the information
needed to prove that there are infinitely often two primes in a given k-tuple for
sufficiently large k does not seem to be too far beyond the currently known results.
Moreover, we can gain enough in the argument by averaging over many tuples to
obtain unconditional results concerning small gaps between primes which go far
beyond anything that has been proved before. Thus, we are able to prove the
existence of very small gaps between primes which, however, go slowly to infinity
with the size of the primes.
The information on primes we utilize in our method is often referred to as the
level of distribution of primes in arithmetic progressions. Let

log n if n prime,
(1.1)
θ(n) =
0
otherwise,
and consider the counting function
(1.2)

X

θ(N ; q, a) =

θ(n).

n≤N
n≡a(mod q)

The first author was supported by NSF grant DMS-0300563, the NSF Focused Research Group
grant 0244660, and the American Institute of Mathematics; the second author by OTKA grants
No. T38396, T43623, T49693 and the Balaton program; the third author by TÜBİTAK.
1

2

D. A. GOLDSTON, J. PINTZ, AND C. Y. YILDIRIM

The Bombieri-Vinogradov theorem states that for any A > 0 there is a B = B(A)
1
such that, for Q = N 2 (log N )−B ,
X
N
N
(1.3)
max
θ(N ; q, a) −
≪
.
a
φ(q)
(log N )A
(a,q)=1
q≤Q

We say that the primes have level of distribution ϑ if (1.3) holds for any A > 0 and
any ε > 0 with
Q = N ϑ−ε .

(1.4)

Elliott and Halberstam [5] conjectured that the primes have level of distribution
1. According to the Bombieri-Vinogradov theorem, the primes are known to have
level of distribution 1/2.
Let n be a natural number and consider the k-tuple
(1.5)

(n + h1 , n + h2 , . . . , n + hk ),

where H = {h1 , h2 , . . . , hk } is a set composed of distinct non-negative integers. If
every component of the tuple is a prime we call this a prime tuple. Letting n range
over the natural numbers, we wish to see how often (1.5) is a prime tuple. For
instance, consider H = {0, 1} and the tuple (n, n + 1). If n = 2, we have the prime
tuple (2 , 3). Notice that this is the only prime tuple of this form because, for n > 2,
one of the numbers n or n + 1 is an even number bigger than 2. On the other hand,
if H = {0, 2}, then we expect that there are infinitely many prime tuples of the
form (n, n + 2). This is the twin prime conjecture. In general, the tuple (1.5) can
be a prime tuple for more than one n only if for every prime p the hi ’s never occupy
all of the residue classes modulo p. This is immediately true for all primes p > k,
so to test this condition we need only to examine small primes. If we denote by
νp (H) the number of distinct residue classes modulo p occupied by the integers hi ,
then we can avoid p dividing some component of (1.5) for every n by requiring
(1.6)

νp (H) < p for all primes p.

If this condition holds we say that H is admissible and we call the tuple (1.5)
corresponding to this H an admissible tuple. It is a long-standing conjecture that
admissible tuples will infinitely often be prime tuples. Our first result is a step
towards confirming this conjecture.
Theorem 1. Suppose the primes have level of distribution ϑ > 1/2. Then there
exists an explicitly calculable constant C(ϑ) depending only on ϑ such that any
admissible k-tuple with k ≥ C(ϑ) contains at least two primes infinitely often.
Specifically, if ϑ ≥ 0.971, then this is true for k ≥ 6.
Since the 6-tuple (n, n + 4, n + 6, n + 10, n + 12, n + 16) is admissible, the ElliottHalberstam conjecture implies that
(1.7)

lim inf (pn+1 − pn ) ≤ 16,
n→∞

where the notation pn is used to denote the n-th prime. This means that pn+1 −pn ≤
16 for infinitely many n. Unconditionally, we prove a long-standing conjecture
concerning gaps between consecutive primes.
Theorem 2. We have
(1.8)

E1 := lim inf
n→∞

pn+1 − pn
= 0.
log pn

PRIMES IN TUPLES I

3

There is a long history of results on this topic which we will briefly mention.
The inequality E1 ≤ 1 is a trivial consequence of the prime number theorem. The
first result of type E1 < 1 was proved in 1926 by Hardy and Littlewood [17],
who on assuming the Generalized Riemann Hypothesis (GRH) obtained E1 ≤ 2/3.
This result was improved by Rankin [25] to E1 ≤ 3/5, also assuming the GRH.
The first unconditional estimate was proved by Erdős [7] in 1940. Using Brun’s
sieve, he showed that E1 < 1 − c with an unspecified positive explicitly calculable
constant c. His estimate was improved by Ricci [26] in 1954 to E1 ≤ 15/16. In
1965 Bombieri and Davenport [2] refined and made unconditional the method of
Hardy and Littlewood by substituting the Bombieri–Vinogradov theorem for the
GRH, and obtained E1 ≤ 1/2. They also combined their method with the method
of Erdős and obtained E1 ≤ 0.4665 . . . . Their result was further refined by Pilt’ai
[24] to E1 ≤ 0.4571 . . . , Uchiyama [31] to E1 ≤ 0.4542 . . . and in several steps by
Huxley [19] [20] to yield E1 ≤ 0.4425 . . . , and finally in 1984 to E1 ≤ .4393 . . .
[21]. In 1988 Maier [22] used his matrix-method to improve Huxley’s result to
E1 ≤ e−γ · 0.4425 · · · = 0.2484 . . . , where γ is Euler’s constant. Maier’s method by
itself gives E1 ≤ e−γ = 0.5614 . . . . The recent version of the method of Goldston
and Yıldırım [12] led, without combination with other methods, to E1 ≤ 1/4.
In a later paper in this series we will prove the quantitative result that
(1.9)

lim inf
n→∞

pn+1 − pn
1

(log pn ) 2 (log log pn )2

< ∞.

While Theorem 1 is a striking new result, it also reflects the limitations of our
current method. Whether these limitations are real or can be overcome is a critical
issue for further investigation. We highlight the following four questions.
Question 1. Can it be proved unconditionally by the current method that there
are infinitely often bounded gaps between primes? Theorem 1 would appear to be
within a hair’s breadth of obtaining this result. However, any improvement in the
level of distribution ϑ beyond 1/2 probably lies very deep, and even the GRH does
not help. Still, there are stronger versions of the Bombieri-Vinogradov theorem, as
found in [3], and the circle of ideas used to prove these results, which may help to
obtain this result.
Question 2. Is ϑ = 1/2 a true barrier for obtaining primes in tuples? Soundararajan [29] has demonstrated this is the case for the current argument, but perhaps
more efficient arguments may be devised.
Question 3. Assuming the Elliott-Halberstam conjecture, can it be proved that
there are three or more primes in admissible k-tuples with large enough k? Even
under the strongest assumptions, our method fails to prove anything about more
than two primes in a given tuple.
Question 4. Assuming the Elliott-Halberstam conjecture, can the twin prime
conjecture be proved using the current approximations?
The limitation of our method, identified in Question 3, is the reason we are less
successful in finding more than two primes close together. However, we are able to
improve on earlier results, in particular the recent results in [12]. For r ≥ 1, let
(1.10)

Er = lim inf
n→∞

pn+r − pn
.
log pn

4

D. A. GOLDSTON, J. PINTZ, AND C. Y. YILDIRIM

Bombieri and Davenport [2] showed Er ≤ r − 1/2. This bound was later improved
by Huxley
√ [19, 20] to Er ≤ r − 5/8 + o(1/r), by Goldston and Yıldırım [12] to
Er ≤ ( r − 1/2)2 , and by Maier [22] to Er ≤ e−γ (r − 5/8 + o (1/r)). In proving
Theorem 2 we will also show, assuming the primes have level of distribution ϑ,
(1.11)

Er ≤ max(r − 2ϑ, 0),

and hence unconditionally Er ≤ r − 1. However, by a more complicated argument,
we will prove the following result.
Theorem 3. Suppose the primes have level of distribution ϑ. Then for r ≥ 2,
√
√
(1.12)
Er ≤ ( r − 2ϑ)2 .
In particular, we have unconditionally, for r ≥ 1,
√
(1.13)
Er ≤ ( r − 1)2 .
From (1.11) or (1.12) we see that the Elliott-Halberstam conjecture implies that
(1.14)

E2 = lim inf
n→∞

pn+2 − pn
= 0.
log pn

We note that if we couple the ideas of the present work with Maier’s matrix method
[22] we then expect that (1.12) can be replaced by the stronger inequality
√
√
(1.15)
Er ≤ e−γ ( r − 2ϑ)2 .
While this paper is our first paper on this subject, we have two other papers
which overlaps it. The first paper [14], written jointly with Motohashi, gives a
short and simplified proof of Theorems 1 and 2. The second paper [13], written
jointly with Graham, uses sieve methods to prove Theorems 1 and 2 and provides
applications for tuples of almost-primes (products of two or more distinct primes.)
The present paper is organized as follows. In Section 2, we describe our method
and its relation to earlier work. We also state Propositions 1 and 2 which incorporate the key new ideas in this paper. These are developed in a more general form
than in [13] or [14] so as to be employable in many applications. In Section 3, we
prove Theorems 1 and 2 using these propositions. The method of proof is due to
Granville and Soundararajan. In Section 4 we make some further comments on
the method used in Section 3. In Section 5 we prove two lemmas needed later. In
Section 6, we prove a special case of Proposition 1 which illustrates the key points
in the general case. In Section 7 we begin the proof of Proposition 1 which is
reduced to evaluating a certain contour integral. In Section 8 we evaluate a more
general contour integral that occurs in the proof of both propositions. In Section
9, we prove Proposition 2. In this paper we do not obtain results that are uniform
in k, and therefore we assume here that our tuples have a fixed length. However,
uniform results are needed for (1.9), and they will be the topic of the next paper
in this series. Finally, we prove Theorem 3 in Section 10.
Notation. In the following c and C will denote (sufficiently) small and (sufficiently) large absolute positive constants, respectively, which have been chosen
appropriately. This is also true for constants formed from c or C with subscripts or
accents. We unconventionally will allow these constants to be different at different
occurences. Constants implied by pure o, O, ≪ symbols will be absolute, unless
otherwise stated. [S] is 1 if the statement S is true and is 0 if S is false. The

PRIMES IN TUPLES I

5

P
P
symbol ♭ indicates the summation is over squarefree integers, and ′ indicates
the summation variables are pairwise relatively prime.
The ideas used in this paper have developed over many years. We are indebted
to many people, not all of whom we can mention. However, we would like to thank
A. Balog, E. Bombieri, T. H. Chan, J. B. Conrey, P. Deift, D. Farmer, K. Ford,
J. Friedlander, A. Granville, C. Hughes, D. R. Heath-Brown, A. Ledoan, H. L.
Montgomery, Sz. Gy. Revesz, P. Sarnak, and K. Soundararajan.
2. Approximating prime tuples
Let
(2.1)

H = {h1 , h2 , . . . , hk } with 1 ≤ h1 , h2 , . . . , hk ≤ h distinct integers,

and let νp (H) denote the number of distinct residue classes modulo p occupied by
the elements of H.1 For squarefree integers d, we extend this definition to νd (H)
by multiplicativity. We denote by
−k 

Y
νp (H)
1
1−
(2.2)
S(H) :=
1−
p
p
p

the singular series associated with H. Since νp (H) = k for p > h, we see that the
product is absolutely convergent and therefore H is admissible as defined in (1.6)
if and only if S(H) 6= 0. Hardy and Littlewood conjectured an asymptotic formula
for the number of prime tuples (n + h1 , n + h2 , . . . , n + hk ), with 1 ≤ n ≤ N , as
N → ∞. Let Λ(n) denote the von Mangoldt function which equals log p if n = pm ,
m ≥ 1, and zero otherwise. We define

(2.3)

Λ(n; H) := Λ(n + h1 )Λ(n + h2 ) · · · Λ(n + hk )

and use this function to detect prime tuples and tuples with prime powers in components, the latter of which can be removed in applications. The Hardy–Littlewood
prime-tuple conjecture [16] can be stated in the form
X
Λ(n; H) = N (S(H) + o(1)), as N → ∞.
(2.4)
n≤N

(This conjecture is trivially true if H is not admissible.) Except for the prime
number theorem (1-tuples), this conjecture is unproved.2
The program the first and third authors have been working on since 1999 is to
compute approximations for (2.3) using short divisor sums and apply the results to
problems on primes. The simplest approximation of Λ(n) is based on the elementary
formula
X
n
µ(d) log ,
(2.5)
Λ(n) =
d
d|n

which can be approximated with the smoothly truncated divisor sum
X
R
µ(d) log .
(2.6)
ΛR (n) =
d
d|n
d≤R

1The restriction of the set H to positive integers is only for simplicity, and, if desired, can
easily be removed later from all of our results.
2Asymptotic results for the number of primes in tuples, unlike the existence result in Theorem
1, are beyond the reach of our method.

6

D. A. GOLDSTON, J. PINTZ, AND C. Y. YILDIRIM

Thus, an approximation for Λ(n; H) is given by
(2.7)

ΛR (n + h1 )ΛR (n + h2 ) · · · ΛR (n + hk ).

In [12], Goldston and Yıldırım applied (2.7) to detect small gaps between primes
and proved


1
pn+1 − pn
≤ .
E1 = lim inf
n→∞
log pn
4
In the process of that work, they realized that for some applications there might
be much better approximations for prime tuples than (2.7), but the approximation
they devised was unsuccessful. Recently, the current authors were able to obtain
such an approximation, which is applied here to the problem of small gaps between
primes.
The idea for our new approximation came from a paper of Heath-Brown [18]
on almost prime tuples. His result is itself a generalization of Selberg’s proof from
1951 (see [28], p. 233–245) that the polynomial n(n + 2) will infinitely often have
at most five distinct prime factors, so that the same is true for the tuple (n, n + 2).
Not only does our approximation have its origin in these papers, but in hindsight
the argument of Granville and Soundararajan (employed in the proof of Theorems
1 and 2) is essentially the same as the method used in these papers.
In connection with the tuple (1.5), we consider the polynomial
(2.8)

PH (n) = (n + h1 )(n + h2 ) · · · (n + hk ).

If the tuple (1.5) is a prime tuple then PH (n) has exactly k prime factors. We
detect this condition by using the k-th generalized von Mangoldt function
k

X
n
,
µ(d) log
(2.9)
Λk (n) =
d
d|n

which vanishes if n has more than k distinct prime factors.3 With this, our prime
tuple detecting function becomes
1
(2.10)
Λk (n; H) := Λk (PH (n)).
k!
The normalization factor 1/k! simplifies the statement of our results. As we will see
in Section 5, this approximation suggests the Hardy–Littlewood type conjecture
X
(2.11)
Λk (n; H) = N (S(H) + o(1)) .
n≤N

This is a special case of the general conjecture of Bateman–Horn [1] which is the
quantitative form of Schinzel’s conjecture [27]. There is not much difference between
(2.4) and (2.11), but the same is not true of their approximations.
In analogy with (2.6) (when k = 1), we approximate Λk by the smoothed and
truncated divisor sum
k

X
R
µ(d) log
d
d|n
d≤R

3As with Λ(n), we overcount the prime tuples by including factors which are proper prime
powers, but these can be removed in applications with a negligible error. The slightly misleading
notational conflict between the generalized von Mangoldt function Λk and ΛR will only occur in
this section.

PRIMES IN TUPLES I

7

and define
(2.12)

ΛR (n; H) =

1
k!

X

d|PH (n)
d≤R

k

R
.
µ(d) log
d

However, as we will see in the next section, this approximation is not adequate to
prove Theorems 1 and 2.
A second simple but crucial idea is needed: rather than only approximate prime
tuples, one should approximate tuples with primes in many components. Thus, we
consider when PH (n) has k + ℓ or less distinct prime factors, where 0 ≤ ℓ ≤ k, and
define
(2.13)

ΛR (n; H, ℓ) =

1
(k + ℓ)!

X

d|PH (n)
d≤R


k+ℓ
R
µ(d) log
,
d

where |H| = k. In Section 4, we will give precisely a measure of how well a function
detects primes in tuples, so that in terms of this measure, when k, ℓ → ∞ and
ℓ = o(k), this approximation is twice as good at detecting primes in tuples as (2.12)
(that is, when ℓ = 0), which in turn is twice as good as (2.7). This improvement
enables us to prove Theorem 2 unconditionally. Moreover, it allows the level of
distribution needed in Theorem 1 to be any number > 1/2.
The advantage of (2.12) and (2.13) over (2.7) can be seen as follows. If in
(2.12) or (2.13) we restrict ourselves to d’s with all prime factors larger than h,
then the condition d|PH (n) implies that we can write d = d1 d2 · · · dk uniquely with
di |n + hi , 1 ≤ i ≤ k, the di ’s pairwise relatively prime, and d1 d2 · · · dk ≤ R. In
1
our application to prime gaps we require that R ≤ N 4 −ε . On the other hand, on
expanding, (2.7) becomes a sum over di |n + hi , 1 ≤ i ≤ k, with d1 ≤ R, d2 ≤ R,
1
. . . , dk ≤ R. The application to prime gaps here requires that Rk ≤ N 4 −ε , and
1
ε
−
so R ≤ N 4k k . Thus (2.7) has a more severe restriction on the range of the
divisors. An additional technical advantage is that having one truncation rather
than k truncations simplifies our calculations.
Our main results on ΛR (n; H, ℓ) are summarized in the following two propositions. Suppose H1 and H2 are, respectively, sets of k1 and k2 distinct non-negative
integers ≤ h. We always assume that at least one of these sets is non-empty. Let
M = k1 + k2 + ℓ1 + ℓ2 .
Proposition 1. Let H = H1 ∪ H2 , |Hi | = ki , and r = |H1 ∩ H2 |. If R ≪
1
N 2 (log N )−4M and h ≤ RC for any given constant C > 0, then as R, N → ∞ we
have
(2.14)


X
ℓ1 + ℓ2 (log R)r+ℓ1 +ℓ2
ΛR (n; H1 , ℓ1 )ΛR (n; H2 , ℓ2 ) =
(S(H) + oM (1))N.
ℓ1
(r + ℓ1 + ℓ2 )!
n≤N

Proposition 2. Let H = H1 ∪ H2 , |Hi | = ki , r = |H1 ∩ H2 |, 1 ≤ h0 ≤ h, and
1
H0 = H ∪ {h0 }. If R ≪M N 4 (log N )−B(M) for a sufficiently large positive constant

8

D. A. GOLDSTON, J. PINTZ, AND C. Y. YILDIRIM

B(M ), and h ≤ R, then as R, N → ∞ we have
(2.15)
X
ΛR (n; H1 , ℓ1 )ΛR (n; H2 , ℓ2 )θ(n + h0 )
n≤N



ℓ1 + ℓ2 (log R)r+ℓ1 +ℓ2


(S(H0 ) + oM (1))N


(r
+ ℓ1 + ℓ2 )!
ℓ1





ℓ1 +ℓ2 +1 (log R)r+ℓ1 +ℓ2 +1
=
(S(H)+oM (1))N

ℓ1 + 1  (r+ℓ1 +ℓ2 +1)!



r+ℓ
+ℓ
+1

ℓ +ℓ +2 (log R) 1 2


 1 2
(S(H)+oM (1))N
ℓ1 + 1
(r+ℓ1 +ℓ2 +1)!

if h0 6∈ H,
if h0 ∈ H1 and h0 6∈ H2 ,
if h0 ∈ H1 ∩ H2 .

Assuming the primes have level of distribution ϑ > 1/2, i.e., (1.3) with (1.4) holds,
ϑ
we may choose, for any ε > 0, R ≪M N 2 −ε and h ≤ Rε .
Remark. By relabeling the variables we obtain the corresponding form if h0 ∈
H2 , h0 6∈ H1 .
Propositions 1 and 2 can be strengthened in several ways. We will show that
the error terms oM (1) can be replaced by a series of lower order terms and a prime
number theorem type of error term. Moreover, we can make the result uniform for
M → ∞ as an explicit function of N and R. This will be proved in a later paper
and used in the proof of (1.9).

3. Proofs of Theorems 1 and 2
In this section we employ Propositions 1 and 2 and a simple argument due to
Granville and Soundararajan to prove Theorems 1 and 2.
For ℓ ≥ 0, Hk = {h1 , h2 , . . . , hk }, 1 ≤ h1 , h2 , . . . , hk ≤ h ≤ R, we deduce from
1
Proposition 1, for R ≪ N 2 (log N )−B(M) and R, N → ∞, that
X

(3.1)

n≤N

ΛR (n; Hk , ℓ)2 ∼

 
2ℓ
1
S(Hk )N (log R)k+2ℓ .
(k + 2ℓ)! ℓ
ϑ

For any hi ∈ Hk , we have from Proposition 2, for R ≪ N 2 −ε , and R, N → ∞,
(3.2)

X

n≤N

ΛR (n; Hk , ℓ)2 θ(n + hi ) ∼



2
2ℓ + 1
S(Hk )N (log R)k+2ℓ+1 .
(k + 2ℓ + 1)!
ℓ

PRIMES IN TUPLES I

9

ϑ

Taking R = N 2 −ε , we obtain4
(3.3)
S:=

2N
X

n=N +1

k
X
i=1

θ(n + hi ) − log 3N

!

ΛR (n; Hk , ℓ)2



2
2ℓ + 1
∼k
S(Hk )N (log R)k+2ℓ+1
(k + 2ℓ + 1)!
ℓ
 
1
2ℓ
− log 3N
S(Hk )N (log R)k+2ℓ
(k + 2ℓ)! ℓ
 


2ℓ + 1
2ℓ
1
2k
S(Hk )N (log R)k+2ℓ .
log R − log 3N
∼
k + 2ℓ + 1 ℓ + 1
(k + 2ℓ)! ℓ
Here we note that the tuple Hk will contain at least two primes if S > 0. This
situation occurs when
(3.4)

k
2ℓ + 1
ϑ > 1.
k + 2ℓ + 1 ℓ + 1

If k, ℓ → ∞ with ℓ = o(k), then the left-hand side has the limit 2ϑ, and thus (3.4)
holds for any ϑ > 1/2 if we choose k and ℓ appropriately depending on ϑ. This
proves the first part of Theorem 1. Next, assuming ϑ > 20/21, we see that (3.4)
holds with ℓ = 1 and k = 7. This proves the second part of Theorem 1 but with
k = 7. The case k = 6 requires a slightly more complicated argument and is treated
later in this section.
The table below gives the values of C(ϑ), defined in Theorem 1, obtained from
(3.4). For a given ϑ, it gives the smallest k and corresponding smallest ℓ for which
(3.4) is true. Here h(k) is the shortest length of any admissible k-tuple, which has
been computed by Engelsma [6] by exhaustive search for 1 ≤ k ≤ 305 and covers
every value in this table and the next except h(421), where we have taken the upper
bound value from [6].
ϑ
1
.95
.90
.85
.80
.75
.70
.65
.60
.55

k
7
8
9
11
16
21
31
51
111
421

ℓ
1
1
1
1
1
2
2
3
5
10

h(k)
20
26
30
36
60
84
140
252
634
2956∗

* indicates this value could be an upper bound of the true value.
4In (3.3), as well as later in (3.8), the asymptotic sign replaces an error term of size o(log N ) in
the parenthesis term after log 3N . We thus make the convention that the asymptotic relationship
holds only up to the size of the apparent main term.

10

D. A. GOLDSTON, J. PINTZ, AND C. Y. YILDIRIM

To prove Theorem 2, we modify the previous proof by considering

Se :=

(3.5)

2N
X

n=N +1




X

1≤h0 ≤h



θ(n + h0 ) − r log 3N 

X

1≤h1 ,h2 ,...,hk ≤h
distinct

ΛR (n; Hk , ℓ)2 ,

e we need the case of Proposition 2
where r is a positive integer. To evaluate S,
where h0 ∈
6 Hk ,
X

(3.6)

n≤N

 
2ℓ
1
S(Hk ∪ {h0 })N (log R)k+2ℓ .
ΛR (n; Hk , ℓ) θ(n + h0 ) ∼
(k + 2ℓ)! ℓ
2

We also need a result of Gallagher [9]: as h → ∞,
X

(3.7)

1≤h1 ,h2 ,...hk ≤h
distinct

S(Hk ) ∼ hk .

ϑ

Taking R = N 2 −ε , and applying (3.1), (3.2), (3.6), and (3.7), we find that
(3.8)
Se ∼

X

k

1≤h1 ,h2 ,...,hk ≤h
distinct

+

X

1≤h0 ≤h
h0 6=hi ,1≤i≤k



2ℓ + 1
2
S(Hk )N (log R)k+2ℓ+1
(k + 2ℓ + 1)!
ℓ

 
2ℓ
1
S(Hk ∪ {h0 })N (log R)k+2ℓ
(k + 2ℓ)! ℓ

!
 
1
2ℓ
− r log 3N
S(Hk )N (log R)k+2ℓ
(k + 2ℓ)! ℓ

 

2k
2ℓ + 1
2ℓ
1
∼
N hk (log R)k+2ℓ .
log R + h − r log 3N
k + 2ℓ + 1 ℓ + 1
(k + 2ℓ)! ℓ
Thus, there are at least r + 1 primes in some interval (n, n + h], N < n ≤ 2N ,
provided that
(3.9)

h>


r−



2ℓ + 1 ϑ
2k
−ε
log N,
k + 2ℓ + 1 ℓ + 1 2

√
which, on letting ℓ = [ k/2] and taking k sufficiently large, gives
(3.10)



1
h > r − 2ϑ + 4ε + O( √ ) log N.
k

This proves (1.11). Theorem 2 is the special case r = 1 and ϑ = 1/2.

PRIMES IN TUPLES I

11

We are now ready to prove the last part of Theorem 1. Consider
(3.11)
′

S :=
=

2N
X

n=N +1
2N
X

n=N +1

=

X

k
X
i=1

k
X
i=1

0≤ℓ1 ,ℓ2 ≤L

θ(n + hi ) − log 3N
θ(n + hi ) − log 3N

!

!

L
X
ℓ=0

!2

aℓ ΛR (n; Hk , ℓ)

X

0≤ℓ1 ,ℓ2 ≤L

aℓ1 aℓ2 ΛR (n; Hk , ℓ1 )ΛR (n; Hk , ℓ2 )

aℓ1 aℓ2 Mℓ1 ,ℓ2 ,

where
fℓ1 ,ℓ2 − (log 3N )Mℓ1 ,ℓ2 ,
Mℓ1 ,ℓ2 = M

(3.12)

ϑ

say. Applying Propositions 1 and 2 with R = N 2 −ε , we deduce that


ℓ1 + ℓ2 (log R)k+ℓ1 +ℓ2
S(Hk )N
Mℓ1 ,ℓ2 ∼
(k + ℓ1 + ℓ2 )!
ℓ1

and

Therefore,
Mℓ1 ,ℓ2



k+ℓ1 +ℓ2 +1
fℓ1 ,ℓ2 ∼ k ℓ1 + ℓ2 + 2 (log R)
M
S(Hk )N.
ℓ1 + 1
(k + ℓ1 + ℓ2 + 1)!



(log R)k+ℓ1 +ℓ2
ℓ1 + ℓ2
S(Hk )N
∼
(k + ℓ1 + ℓ2 )!
ℓ1


k(ℓ1 + ℓ2 + 2)(ℓ1 + ℓ2 + 1)
×
log R − log 3N .
(ℓ1 + 1)(ℓ2 + 1)(k + ℓ1 + ℓ2 + 1)

Defining bℓ = (log R)ℓ aℓ and b to be the column matrix corresponding to the vector
(b0 , b1 , . . . , bL ), we obtain
(3.13)
1
S′
S ∗ (N, Hk , ϑ, b) :=
S(Hk )N (log R)k+1




X
k(ℓ1 + ℓ2 + 2)(ℓ1 + ℓ2 + 1)
2
1
ℓ1 + ℓ2
−
∼
b ℓ1 b ℓ2
(k + ℓ1 + ℓ2 )! (ℓ1 + 1)(ℓ2 + 1)(k + ℓ1 + ℓ2 + 1) ϑ
ℓ1
0≤ℓ1 ,ℓ2 ≤L

∼ bT M b,

where





i+j
1
k(i + j + 2)(i + j + 1)
2
(3.14) M =
.
−
i
(k + i + j)! (i + 1)(j + 1)(k + i + j + 1) ϑ 0≤i,j≤L
We need to choose b so that S ∗ > 0 for a given ϑ and minimal k. On taking b to
be an eigenvector of the matrix M with eigenvalue λ, we see that
(3.15)

S ∗ ∼ bT λb = λ

k
X
i=0

|bi |2

will be > 0 provided that λ is positive. Therefore S ∗ > 0 if M has a positive
eigenvalue and b is chosen to be the corresponding eigenvector. Using Mathematica

12

D. A. GOLDSTON, J. PINTZ, AND C. Y. YILDIRIM

we computed the values of C(ϑ) indicated in the following table, which may be
compared to the earlier table obtained from (3.4).
ϑ
1
.95
.90
.85
.80
.75
.70
.65
.60
.55

k
6
7
8
10
12
16
22
35
65
193

L
1
1
2
2
2
2
4
4
6
9

h(k)
16
20
26
32
42
60
90
158
336
1204

In particular, taking k = 6, L = 1, b0 = 1, and b1 = b in (3.13), we get





4
112
16
1
96 −
+ b2 4 −
+ 2b 18 −
S∗ ∼
8!
ϑ
ϑ
ϑ


4(1 − ϑ) 2
18ϑ − 16 96ϑ − 112
∼−
b − 2b
−
8!ϑ
4(1 − ϑ)
4(1 − ϑ)

2

18ϑ − 16
15ϑ2 − 64ϑ + 48
4(1 − ϑ)
b−
.
+
∼−
8!ϑ
4(1 − ϑ)
4(1 − ϑ)2
Choosing b =

18ϑ−16
4(1−ϑ) ,

we then have
S∗ ∼ −

15ϑ2 − 64ϑ + 48
,
8!ϑ(1 − ϑ)

of which the right-hand side is >√0 if ϑ ≤ 1 lies between the two roots of the
quadratic; this occurs when 4(8 − 19)/15 < ϑ ≤ 1. Thus, there are at least two
primes in any admissible tuple Hk for k = 6, if
√
4(8 − 19)
= 0.97096 . . . .
(3.16)
ϑ>
15
This completes the proof of Theorem 1.
4. Further Remarks on Section 3
We can formulate the method of Section 3 as follows. For a given tuple H =
{h1 , h2 , . . . , hk } we define
!
2N
2N
k
X
X
X
θ(n + hi ) fR (n; H)2 ,
(4.1)
Q1 :=
fR (n; H)2 , Q2 :=
n=N +1

n=N +1

i=1

where f should be chosen to make Q2 large compared with Q1 , and R = R(N ) will
be chosen later. It is reasonable to assume
X
λd,R .
(4.2)
fR (n; H) =
d|PH (n)
d≤R

PRIMES IN TUPLES I

13

Our goal is to select the λd,R which maximizes

 
Q2
1
.
(4.3)
ρ = ρ(N ; H, f ) :=
log 3N Q1
If ρ > r for some N and positive integer r, then there exists an n, N < n ≤ 2N ,
such that the tuple (1.5) has at least r + 1 prime components.
This method is exactly the same as that introduced for twin primes by Selberg
and for general tuples by Heath-Brown. However, they used the divisor function
d(n + hi ) in Q2 in place of θ(n + hi ). Heath-Brown even chose f = ΛR (n; H, 1).
As a first example, suppose we choose f as in (2.6) and (2.7), so that
(4.4)

fR (n; H) =

By [12], we have, as R, N → ∞,
(4.5)

5

k
Y

ΛR (n + hi ).

i=1

Q1 ∼ N S(H)(log R)k

Q2 ∼ kN S(H)(log R)k+1

ϑ0

1

if R ≤ N 2k (1−ε) ,
ϑ

if R ≤ N 2k (1−ε) .

On taking R = N 2k , 0 < ϑ0 < ϑ, we see that, as N → ∞,
ϑ0
log R
∼
.
(4.6)
ρ∼k
log N
2
Notice that ρ < 1, so we fail to detect primes in tuples. In Section 3, we proved
that on choosing f = ΛR (n; H, ℓ), by (3.1) and (3.2), as N → ∞,
k
2ℓ + 1
(4.7)
ρ∼
ϑ0 .
k + 2ℓ + 1 ℓ + 1
k
If ℓ = 0 this gives ρ ∼ k+1
ϑ0 , which, for large k, is twice as large as (4.6), while
(4.7) gains another factor of two when ℓ → ∞ slowly as k → ∞. This finally shows
ρ > 1 if ϑ > 1/2, but just fails if ϑ = 1/2.
In (3.11) we chose


L
X
X
bℓ
log(R/d)
(4.8)
fR (n; H) =
µ(d)P
ΛR (n; Hk , ℓ) =
(log R)ℓ
log R
ℓ=0

d|PH (n)
d≤R

where P is a polynomial with a k-th order zero at 0. The matrix procedure does not
provide a method for analyzing ρ unless L is taken fixed, but the general problem
has been solved by Soundararajan [29]. In particular, he showed that ρ < 1 if
ϑ = 1/2, so that one can not prove there are bounded gaps between primes using
(4.8). The exact solution from Soundrarajan’s analysis was obtained by a calculus
of variations argument by Conrey, which gives, as N → ∞,
k(k − 1)
(4.9)
ρ=
ϑ0 ,
2β
where β is determined as the solution of the equation
R 1 k−2
p
p
y
q(y)2 dy
k
1
= R 01
with q(y) = Jk−2 (2 β) − y 1− 2 Jk−2 (2 βy)
(4.10)
β
y k−1 q ′ (y)2 dy
0
ϑ

5For special reasons, the validity of the formula for Q actually holds here for R ≤ N 2(k−1) (1−ε)
2

if k ≥ 2, but this is insignificant for the present discussion.

14

D. A. GOLDSTON, J. PINTZ, AND C. Y. YILDIRIM

where Jk is the Bessel function of the first type. Using Mathematica, one can check
that this gives exactly the values of k in the previous table, which is in agreement
with our earlier calculations; but it provides somewhat smaller values of ϑ for which
a given k-tuple will contain two primes. Thus, for example, we can replace (3.16)
by the result that every admissible 6-tuple will contain at least two primes if
(4.11)

ϑ > .95971 . . . .
5. Two Lemmas

In this section we will prove two lemmas needed for the proof of Propositions 1
and 2. The conditions on these lemmas have been constructed in order for them to
hold uniformly in the given variables.
The Riemann zeta-function has the Euler product representation, with s = σ+it,

(5.1)

ζ(s) =

−1
Y
1
,
1− s
p
p

σ > 1.

The zeta-function is analytic except for a simple pole at s = 1, where as s → 1
(5.2)

ζ(s) =

1
+ γ + O(|s − 1|).
s−1

(Here γ is Euler’s constant.) We need standard information concerning the classical
zero-free region of the Riemann zeta-function. By Theorem 3.11 and (3.11.8) in
[30], there exists a small constant c > 0, for which we assume c ≤ 10−2 , such that
ζ(σ + it) 6= 0 in the region
(5.3)

σ ≥1−

4c
log(|t| + 3)

for all t. Furthermore, we have

(5.4)

ζ(σ + it) −

1
1
≪ log(|t| + 3),
≪ log(|t| + 3),
σ − 1 + it
ζ(σ + it)
1
ζ′
(σ + it) +
≪ log(|t| + 3),
ζ
σ − 1 + it

in this region. We will fix this c for the rest of the paper (we could take, for instance,
c = 10−2 , see [8]). Let L denote the contour given by
(5.5)

s=−

c
+ it.
log(|t| + 3)

Lemma 1. We have, for R ≥ C, k ≥ 2, B ≤ Ck,
Z
√
Rs
(log(|s| + 3))B k ds ≪ C1k R−c2 + e− c log R/2 ,
(5.6)
s
L
where C1 , c2 and the implied constant in ≪ depends only on the constant C in the
formulation of the lemma. In addition, if k ≤ c3 log R with a sufficiently small c3
depending only on C, then
Z
√
Rs
(5.7)
(log(|s| + 3))B k ds ≪ e− c log R/2 .
s
L

PRIMES IN TUPLES I

15

Proof. The left-hand side of (5.6) is, with C4 depending on C,
Z ∞
(log(|t| + 4))B
dt
≪
Rσ(t)
(|t| + c/2)k
0
c
Z ω−3 − log(|t|+3)
Z ∞
Z C4
R
(5.8)
C1k R−c2 dt +
dt
+
t−3/2 dt
≪
t3/2
C4
0
ω−3
≪ C1k R−c2 + e−

c log R
log ω

1

+ ω− 2 ,

√
where now C1 is a constant depending on C. On choosing log ω = c log R, the
first part of the lemma follows. The second part is an immediate consequence of
the first part.
The next lemma provides some explicit estimates for sums of the generalized
divisor function. Let ω(q) denote the number of prime factors of a squarefree
integer q. For any real number m, we define
dm (q) = mω(q) .

(5.9)

This agrees with the usual definition of the divisor functions when m is a positive
integer. Clearly, dm (q) is a monotonically increasing function of m (for a fixed q),
and for real m1 , m2 , and y, we see that
dm1 (q)dm2 (q) = dm1 m2 (q),

(5.10)

(dm (q))y = dmy (q).

P♭
Recall
indicates a sum over squarefree integers. We use the ceiling function
⌈y⌉ := min{n ∈ Z; y ≤ n}.
Lemma 2. We have, for any positive real m and x ≥ 1
X♭ dm (q)
≤ (⌈m⌉ + log x)⌈m⌉ ≤ (m + 1 + log x)m+1
(5.11)
D′ (x, m) :=
q
q≤x

and
(5.12)

D∗ (x, m) :=

X♭

q≤x

dm (q) ≤ x(⌈m⌉ + log x)⌈m⌉ ≤ x(m + 1 + log x)m+1 .

′

For ν ≥ max(c log(K + 1), 1), there is an absolute constant C ′ depending on c′
such that, for x ≥ 1 and K ≥ 1, we have
(5.13)

X♭ (d3K (q))1+ ν1
′
≤ (C ′ K + log x)C K .
q

q≤x

Proof. First, we treat the case when m is a positive integer. We prove (5.11) by
induction. Observe the assertion is true for m = 1, that is, when d1 (q) = 1 by
definition. Suppose (5.11) is proved for m − 1. Let us denote the smallest term in a
given product representation of q by j = j(q) ≤ x1/m . Then this factor can stand
at m places, and, therefore, with q = q ′ j(q) = q ′ j, we have
1/m

xX
X♭ dm (q)
≤m
q
j=1

q≤x

♭

1
1 X♭ dm−1 (q ′ )
m−1
≤ m(1 + log x m ) (m − 1 + log x)
′
j′
q

q ≤x/j

≤ (m + log x)(m + log x)m−1 = (m + log x)m .

16

D. A. GOLDSTON, J. PINTZ, AND C. Y. YILDIRIM

This completes the induction. For real m, the result holds since D′ (x, m) ≤
D′ (x, ⌈m⌉). Next, we note that (5.12) follows from (5.11) because D∗ (x, m) ≤
′
1
xD′ (x, m). To prove (5.13), let r := (3K)1+ ν ≤ 9e1/c K. By (5.9), we have
(d3K (q))

1+ ν1

= dr (q),
′

′

and the result follows by (5.11) with C = 9e1/c + 1.
6. A special case of Proposition 1
In this section we will prove a special case of Proposition 1 which illustrates the
method without involving the technical complications that appear in the general
case. This allows us to set up some notation and obtain estimates for use in the
general case. We also obtain the result uniformly in k.
Assume H is non-empty (so that k ≥ 1), ℓ = 0, and ΛR (n; H, 0) = ΛR (n; H).
Proposition 1 (Special Case). Supposing
(6.1)

1

k ≪η0 (log R) 2 −η0 with an arbitrarily small fixed η0 > 0,

and h ≤ RC , with C any fixed positive number, we have
(6.2)

N
X

n=1

ΛR (n; H) = S(H)N + O(N e−c

√

log R

This result motivates the conjecture (2.11).


) + O R(2 log R)2k .

Proof. We have
(6.3)

SR (N ; H) :=

N
X

n=1

ΛR (n; H) =

k X

1 X
R
µ(d) log
1.
k!
d
d≤R

1≤n≤N
d|PH (n)

If for a prime p we have p|PH (n), then among the solutions n ≡ −hi (mod p),
1 ≤ i ≤ k, there will be νp (H) distinct solutions modulo p. For d squarefree we
then have by multiplicativity νd (H) distinct solutions for n modulo d which satisfy
d|PH (n), and for each solution one has n running through a residue class modulo
d. Hence we see that


X
N
+ O(1) .
(6.4)
1 = νd (H)
d
1≤n≤N
d|PH (n)

Trivially νq (H) ≤ k ω(q) = dk (q) for squarefree q. Therefore, we conclude that
(6.5)






k
k X♭
X µ(d)νd (H) 
R
1
(log
R)
+O
SR (N ; H) = N 
log
νd (H)
k!
d
d
k!
d≤R
d≤R

2k
= N TR (N ; H) + O R(k + log R) ,

by Lemma 2.
Let (a) denote the contour s = a + it, −∞ < t < ∞. We apply the formula

Z
xs
1
0
if 0 < x ≤ 1,
ds
=
(6.6)
1
k
(log
x)
if
x ≥ 1,
2πi
sk+1
k!
(c)

PRIMES IN TUPLES I

17

for c > 0, and have that
(6.7)

TR (N ; H) =

1
2πi

Z

F (s)

Rs
ds,
sk+1

(1)

where, letting s = σ + it and assuming σ > 0,
(6.8)

F (s) =

∞
X
µ(d)νd (H)
d=1

d1+s

=

Y
p

1−

νp (H) 
.
p1+s

Since νp (H) = k for all p > h, we write
(6.9)

F (s) =

GH (s)
,
ζ(1 + s)k

where by (5.1)

(6.10)


−k
Y
νp (H)
1
GH (s) =
1 − 1+s
1 − 1+s
p
p
p

 k 2 
Y
k − νp (H)
,
+ Oh 2+2σ
=
1+
p1+s
p
p

which is analytic and uniformly bounded for σ > −1/2 + δ for any δ > 0. Also, by
(2.2) we see that
(6.11)

GH (0) = S(H).

From (5.4) and (6.9), the function F (s) satisfies the bound
(6.12)

F (s) ≪ |GH (s)|(C log(|t| + 3))k .

in the region on and to right of L. Here GH (s) is analytic and bounded in this
region, and has a dependence on both k and the size h of the components of H.
We note that νp (H) = k not only when p > h, but whenever p 6 | ∆, where
Y
(6.13)
∆ :=
|hj − hi |,
1≤i<j≤k

since then all k of the hi ’s are distinct modulo p. We now introduce an important
parameter U that is used throughout the rest of the paper. We want U to be an
2
upper bound for log ∆, and since trivially ∆ ≤ hk we choose
(6.14)

U := Ck 2 log(2h)

and have
(6.15)

log ∆ ≤ U.

We now prove, for −1/4 < σ ≤ 1,
(6.16)

|GH (s)| ≪ exp(5kU δ log log U ),

where δ = max(−σ, 0).

18

D. A. GOLDSTON, J. PINTZ, AND C. Y. YILDIRIM

We treat separately the different pieces of the product defining GH . First, by use
of the inequality log(1 + x) ≤ x for x ≥ 0, we have


Y
Y
k
νp (H)
≤
1 + 1−δ
1 − 1+s
p
p
p≤U
p≤U
X

= exp
log 1 +
p≤U

≤ exp

X

k

p1−δ

p≤U


k 

p1−δ




X 1
δ
≤ exp kU
p
p≤U


≪ exp kU δ log log U .

Second, by the same estimates and the inequality (1−x)−1 ≤ 1+3x for 0 ≤ x ≤ 2/3,
we see that
Y
1−

p≤U

1
p1+s

−k

Y
≤
1−
p≤U

Y
≤
1+
p≤U

1
p1−δ
3
p1−δ

−1 k
 k


≪ exp 3kU δ log log U .



since

1
p1−δ

≤

1
23/4

<

2
,
3


Hence, the terms in the product for GH (s) with p ≤ U are ≪ exp 4kU δ log log U .
For the terms p > U , we first consider those for which p|∆. In absolute value,
they are
≤

Y
1+

p|∆
p>U


1+
1−δ
k

p

3
p1−δ

k

≤ exp

 X 4k 
.
p1−δ
p|∆
p>U

Since there are less than (1 + o(1)) log ∆ < U primes with p|∆, the sum above is
increased if we replace these terms with the integers between U and 2U . Therefore
the right-hand side above is

≤ exp 4k

X

U<n≤2U

1 

n1−δ


≤ exp 4k(2U )δ

X

U<n≤2U

Finally, if p > U
k
k
1
≤ 1−δ ≤ ,
p1+s
U
2

1
≤ exp(4kU δ ).
n

PRIMES IN TUPLES I

19

so that in absolute value the terms with p > U and p 6 | ∆ are

−k
Y
1
k
1 − 1+s
1 − 1+s
=
p
p
p6 |∆
p>U

= exp

X

p6 |∆
p>U

!
∞
∞
X
X
1  1 ν
1  k ν
+k
−
ν p1+s
ν p1+s
ν=1
ν=1

XX

∞
2  k ν
ν p1−δ
p>U ν=2

XX
∞ 
k ν
≤ exp
p1−δ
p>U ν=2

X 1 
≤ exp 2k 2
n2−2δ
n>U
 4k 2 U δ 
≤ exp
U 1−δ
≤ exp 2kU δ .
≤ exp


Thus, the terms with p > U contribute ≤ exp 6kU δ , from which we obtain (6.16).
In conclusion, for h ≪ RC (where C > 0 is fixed and as large as we wish) and
for s on or to the right of L, we have
(6.17)

F (s) ≪ (C log(|t| + 3))k exp(5kU δ log log U ).

Returning to the integral in (6.7), we see that the integrand vanishes as |t| → ∞,
−1/4 < σ ≤ 1. By (6.9) we see that in moving the contour from (1) to the left
to L we either pass through a simple pole at s = 0 when H is admissible (so that
S(H) 6= 0), or we pass through a regular point at s = 0 when H is not admissible.
In either case, we have by virtue of (5.2), (6.11), (6.14), (6.17), and Lemma 1, for
any k satisfying (6.1),
Z
1
Rs
TR (N ; H) = GH (0) +
F (s) k+1 ds
2πi
s
(6.18)
L
= S(H) + O(e−c

√

log R

).

Equation (6.2) now follows from this and (6.5).
Remark. The exponent 1/2 in the restriction k ≪ (log R)1/2−η0 is not significant.
Using Vinogradov’s zero-free region for ζ(s) we could replace 1/2 by 3/5.
7. First Part of the Proof of Proposition 1
Let
(7.1)

H = H1 ∪ H2 , |H1 | = k1 , |H2 | = k2 , k = k1 + k2 ,
r = |H1 ∩ H2 |, M = k1 + k2 + ℓ1 + ℓ2 .

Thus |H| = k − r. We prove Proposition 1 in the following sharper form.

20

D. A. GOLDSTON, J. PINTZ, AND C. Y. YILDIRIM

Proposition 1′ . Let h ≪ RC , where C is any positive fixed constant. As R, N →
∞, we have

(7.2)


X
ℓ1 + ℓ2 (log R)r+ℓ1 +ℓ2
S(H)N
ΛR (n; H1 , ℓ1 )ΛR (n; H2 , ℓ2 ) =
(r + ℓ1 + ℓ2 )!
ℓ1
n≤N

+N

r+ℓ
1 +ℓ2
X
j=1

Dj (ℓ1 , ℓ2 , H1 , H2 )(log R)r+ℓ1 +ℓ2 −j



√
+ OM N e−c log R + O(R2 (3 log R)3k+M ),

where the Dj (ℓ1 , ℓ2 , H1 , H2 )’s are functions independent of R and N which satisfy
the bound
′

Dj (ℓ1 , ℓ2 , H1 , H2 ) ≪M (log U )Cj ≪M (log log 10h)Cj

(7.3)

where U is defined in (6.14) and Cj and Cj′ are two positive constants depending
on M .
Proof. We can assume that both H1 and H2 are non-empty since the case where
one of these sets is empty can be covered in the same way as we did in case of ℓ = 0
in Section 6. Thus k ≥ 2 and we have
(7.4)
N
X
SR (N ; H1 , H2 , ℓ1 , ℓ2 ) :=
ΛR (n; H1 , ℓ1 )ΛR (n; H2 , ℓ2 )
n=1

=

k +ℓ 
k +ℓ

X
R 2 2
R 1 1
1
log
µ(d)µ(e) log
(k1 + ℓ1 )!(k2 + ℓ2 )!
d
e
d,e≤R

X

1.

1≤n≤N
d|PH1 (n)
e|PH2 (n)

For the inner sum, we let d = a1 a12 , e = a2 a12 where (d, e) = a12 . Thus a1 , a2 ,
and a12 are pairwise relatively prime, and the divisibility conditions d|PH1 (n) and
e|PH2 (n) become a1 |PH1 (n), a2 |PH2 (n), a12 |PH1 (n), and a12 |PH2 (n). As in Section
6, we get νa1 (H1 ) solutions for n modulo a1 , and νa2 (H2 ) solutions for n modulo
a2 . If p|a12 , then from the two divisibility conditions we have νp (H1 (p) ∩ H2 (p))
solutions for n modulo p, where
H(p) = {h′ 1 , . . . , h′ νp (H) : h′ j ≡ hi ∈ H for some i, 1 ≤ h′ j ≤ p}
Notice that H(p) = H if p > h . Alternatively, we can avoid this definition which
is necessary only for small primes by defining
(7.5)

ν p (H1 ∩ H2 ) := νp (H1 (p) ∩ H2 (p)) := νp (H1 ) + νp (H2 ) − νp (H)

and then extend this definition to squarefree numbers by multiplicativity.6 Thus
we see that


X
N
1 = νa1 (H1 )νa2 (H2 )ν a12 (H1 ∩ H2 )
+ O(1) ,
a1 a2 a12
1≤n≤N
d|PH1 (n)
e|PH2 (n)

6We are making a convention here that for ν we take intersections modulo p.
p

PRIMES IN TUPLES I

and have
(7.6)
SR (N ; ℓ1 , ℓ2 , H1 , H2 )
=

N
(k1 + ℓ1 )!(k2 + ℓ2 )!




M
+O
(log R)

21

X′ µ(a1 )µ(a2 )µ(a12 )2 νa (H1 )νa (H2 )ν a (H1 ∩ H2 )
2
1
12
a1 a2 a12

a1 a12 ≤R
a2 a12 ≤R

X′

a1 a12 ≤R
a2 a12 ≤R


× log

R
a1 a12

k1 +ℓ1 

R
log
a2 a12

k2 +ℓ2




µ(a1 )2 µ(a2 )2 µ(a12 )2 νa1 (H1 )νa2 (H2 )ν a12 (H1 ∩ H2 )


= N TR (ℓ1 , ℓ2 ; H1 , H2 ) + O(R2 (3 log R)3k+M ),
P′
indicates the summands are pairwise relatively prime. Notice that by
where
Lemma 2, the error term was bounded by
X♭ X
≪ (log R)M
dk (q)
q≤R2 q=a1 a2 a12

X♭

= (log R)M

d3 (q)dk (q)

q≤R2

= (log R)M

X♭

d3k (q)

q≤R2

≪ R2 (3 log R)3k+M .
By (6.6), we have
(7.7)

TR (ℓ1 , ℓ2 ; H1 , H2 ) =

1
(2πi)2

Z Z

F (s1 , s2 )

Rs1

Rs2

s1 k1 +ℓ1 +1 s2 k2 +ℓ2 +1

ds1 ds2 ,

(1)(1)

where, by letting sj = σj + itj and assuming σ1 , σ2 > 0,
X′
µ(a1 )µ(a2 )µ(a12 )2 νa1 (H1 )νa2 (H2 )ν a12 (H1 ∩ H2 )
F (s1 , s2 ) =
a1 1+s1 a2 1+s2 a12 1+s1 +s2
1≤a1 ,a2 ,a12 <∞
(7.8)


Y
νp (H1 ) νp (H2 ) ν p (H1 ∩ H2 )
=
1 − 1+s1 − 1+s2 +
.
p
p
p1+s1 +s2
p

Since for all p > h we have νp (H1 ) = k1 , νp (H2 ) = k2 , and νp (H1 ∩ H2 ) = r, we
factor out the dominant zeta-factors and write
ζ(1 + s1 + s2 )r
,
(7.9)
F (s1 , s2 ) = GH1 ,H2 (s1 , s2 )
ζ(1 + s1 )k1 ζ(1 + s2 )k2

where by (5.1)
(7.10)
GH1 ,H2 (s1 , s2 ) =



Y 1 −

p

νp (H1 )
p1+s1

−

1−


1−
k2

ν (H ∩H )
+ pp1+s11 +s22
k1 
1
1
1 − p1+s
2
p1+s1

νp (H2 )
p1+s2

1

p1+s1 +s2

r 



22

D. A. GOLDSTON, J. PINTZ, AND C. Y. YILDIRIM

is analytic and uniformly bounded for σ1 , σ2 > −1/4 + δ, for any fixed δ > 0. Also,
from (2.2), (7.1), and (7.5) we see immediately that
GH1 ,H2 (0, 0) = S(H).

(7.11)

Furthermore, the same argument leading to (6.16) shows that for s1 , s2 on L or to
the right of L
(7.12)

GH1 ,H2 (s1 , s2 ) ≪ exp(CkU δ1 +δ2 log log U ).

with δi = − min(σi , 0) and U defined in (6.14). Thus for s1 and s2 on L or to the
right of L we have
(7.13)


2k
1
δ1 +δ2
.
log log U ) log(2+|t1 |) log(2+|t2 |)
max 1,
F (s1 , s2 ) ≪ exp(CkU
|s1 + s2 |r
The integrand of (7.7) vanishes as either |t1 | → ∞ or |t2 | → ∞, σ1 , σ2 ∈ [−c, 1].
We define
(7.14)

W (s) := sζ(1 + s)

and
(7.15)

D(s1 , s2 ) = GH1 ,H2 (s1 , s2 )

so that
1
(7.16) TR (ℓ1 , ℓ2 ; H1 , H2 ) =
(2πi)2

Z Z

W (s1 + s2 )r
,
W (s1 )k1 W (s2 )k2

D(s1 , s2 )

Rs1 +s2
ds1 ds2 .
s1 ℓ1 +1 s2 ℓ2 +1 (s1 + s2 )r

(1) (1)

To complete the proof of Proposition 1, we need to evaluate this integral. We
will also need to evaluate a similar integral in the proof of Proposition 2, where
the parameters k1 , k2 , and r have several slightly different relationships with H1
and H2 , and G is slightly altered. Therefore we change notation to handle these
situations simultaneously.
8. Completion of the proof of Proposition 1: Evaluating an integral
Let
(8.1)

TR∗ (a, b, d, u, v, h) :=

1
(2πi)2

Z Z

(1) (1)

D(s1 , s2 )Rs1 +s2
u+1 v+1
s1 s2 (s1 + s2 )d

ds1 ds2 ,

where
(8.2)

D(s1 , s2 ) =

G(s1 , s2 )W d (s1 + s2 )
W a (s1 )W b (s2 )

and W is from (7.14). We assume G(s1 , s2 ) is regular on L and to the right of L
and satisfies the bound
(8.3)

G(s1 , s2 ) ≪M exp(CM U δ1 +δ2 log log U ),

where U = CM 2 log(2h).

Lemma 3. Suppose that
(8.4)

0 ≤ a, b, d, u, v ≤ M,

a + u ≥ 1,

b + v ≥ 1,

d ≤ min(a, b),

PRIMES IN TUPLES I

23

where M is a large constant and our estimates may depend on M . Let h ≪ RC ,
with C any positive fixed constant. Then we have, as R → ∞,


u + v (log R)u+v+d
∗
G(0, 0)
TR (a, b, d,u, v, h) :=
(u + v + d)!
u
(8.5)
u+v+d
 √

X
Dj (a, b, d, u, v, h)(log R)u+v+d−j + OM e−c log R ,
+
j=1

where the Dj (a, b, d, u, v, h)’s are functions independent of R which satisfy the bound
′

Dj (a, b, d, u, v, h) ≪M (log U )Cj ≪M (log log 10h)Cj

(8.6)

for some positive constants Cj , Cj′ depending on M .
Proof. As in Section 7, we see the integrand in (8.1) vanishes as |t1 | → ∞ or
|t2 | → ∞. We first shift the contour (1) for the integral over s1 to L, passing a pole
at s1 = 0, and obtain
(8.7)


Z
ZZ
D(s1 , s2 )Rs1 +s2
1
D(s1 , s2 )Rs1 +s2
1
∗
Res u+1 v+1
ds
+
ds1 ds2 .
TR =
2
2πi s1 =0 s1 s2 (s1 + s2 )d
(2πi)2
su+1
sv+1
(s1 + s2 )d
1
2
(1)

(1) L

In the first term, we move the contour over s2 along (1) to L, and pass a pole at
s2 = 0. For the second term, after interchanging the order of integration, we move
the contour (1) to the left to L passing poles at s2 = −s1 and s2 = 0. We thus
obtain
(8.8)



Z
1
D(s1 , s2 )Rs1 +s2
D(s1 , s2 )Rs1 +s2
+
ds2
Res
s2 =0 s1 =0 su+1 sv+1 (s1 + s2 )d
2πi s1 =0 su+1
sv+1
(s1 + s2 )d
1
2
1
2
L




Z
Z
1
D(s1 , s2 )Rs1 +s2
D(s1 , s2 )Rs1 +s2
1
+
Res
Res
ds1 +
ds1
2πi s2 =0 su+1
2πi s2 =−s1 su+1
sv+1
sv+1
(s1 + s2 )d
(s1 + s2 )d
1
1
2
2
L
L
Z Z
D(s1 , s2 )Rs1 +s2
1
ds1 ds2 := I0 + I1 + I2 + I3 + I4 .
+
(2πi)2
su+1
sv+1
(s1 + s2 )d
1
2

TR∗ = Res Res

L L

We will see that the residue I0 provides the main term and some of the lower order
terms, the integral I3 provides the remaining lower order terms, and the integrals
I1 , I2 , and I4 are error terms.
We consider first I0 . At s1 = 0 there is a pole of order ≤ u + 1, and therefore 7
by Leibniz’s rule we have


u  
i
D(s1 , s2 )
1 X u
D(s1 , s2 )Rs1
u−i ∂
=
(log
R)
s1 =0 su+1 (s1 + s2 )d
u! i=0 i
∂si1 (s1 + s2 )d
1
Res

s1 =0

7If G(0, 0) = 0 then the order of the pole is u or less, but the formula we use to compute the
residue is still valid. In this situation one or more of the initial terms will have the value zero.

24

D. A. GOLDSTON, J. PINTZ, AND C. Y. YILDIRIM

and
∂i
∂si1



D(s1 , s2 )
(s1 + s2 )d



= (−1)i
s1 =0

i   j
X
i ∂
D(s1 , s2 )
+
j
∂sj1
j=1

D(0, s2 )d(d + 1) · · · (d + i − 1)
sd+i
2
(−1)i−j

d(d + 1) · · · (d + i − j − 1)
sd+i−j
2

s1 =0

,

where in case of i = j (including the case when i = j = 0 and d ≥ 0 arbitrary) the
empty product in the numerator is 1. We conclude that
u

i

X X a(i, j)(log R)u−i ∂ j
D(s1 , s2 )Rs1
=
D(s1 , s2 )
Res u+1
s1 =0 s
(s1 + s2 )d
∂sj1
sd+i−j
1
2
i=0 j=0

(8.9)

s1 =0

with a(i, j) given explicitly in the previous equations. To complete the evaluation
of I0 , we see that the (i, j)th term contributes to I0 a pole at s2 = 0 of order
v + 1 + d + i − j (or less), and therefore by Leibniz’s formula
Res

s2 =0

=

Rs2

∂j

∂sj1
sv+1+d+i−j
2

1
(v + d + i − j)!

D(s1 , s2 )
s1 =0

v+d+i−j
X
m=0



∂m ∂j
v+d+i−j
(log R)v+d+i−j−m m j D(s1 , s2 )
m
∂s2 ∂s1

.
s1 =0
s2 =0

This completes the evaluation of I0 , and we conclude
(8.10)
 m j

i v+d+i−j
u X
X
X
∂
∂
b(i, j, m)
D(s
,
s
)
I0 =
(log R)u+v+d−j−m ,
1 2
m
j
∂s
∂s
2
s1 =0
1
i=0 j=0 m=0
s2 =0

where
(8.11)

i−j

b(i, j, m) = (−1)

  

u
i
v + d + i − j d(d + 1) · · · (d + i − j − 1)
.
i
j
m
u!(v + d + i − j)!

The main term is of order (log R)u+v+d and occurs when j = m = 0. Therefore,
it is given by
!
u  
1 X u
i d(d + 1) · · · (d + i − 1)
u+v+d
.
(−1)
G(0, 0)(log R)
u! i=0 i
(v + d + i)!
It is not hard to prove that


u  
1 X u
d(d + 1) · · · (d + i − 1)
u+v
1
(8.12)
(−1)i
=
,
u! i=0 i
(v + d + i)!
(u + v + d)!
u

from which we conclude that the main term is


u+v
1
(8.13)
G(0, 0)
(log R)d+u+v .
u
(u + v + d)!

Motohashi found the following approach which avoids proving (8.12) directly and
can be used to simplify some of the previous analysis. Granville also made a similar

PRIMES IN TUPLES I

25

observation. The residue we are computing is equal to
Z Z
1
D(s1 , s2 )Rs1 +s2
ds1 ds2 ,
(2πi)2
su+1
sv+1
(s1 + s2 )d
1
2
Γ2 Γ1

where Γ1 and Γ2 are the circles |s1 | = ρ and |s2 | = 2ρ, respectively, with a small
ρ > 0. Writing s1 = s and s2 = sw, this is equal to
Z Z
1
D(s, sw)Rs(w+1)
ds dw,
(2πi)2
su+v+d+1 wv+1 (w + 1)d
Γ3 Γ1

with Γ3 the circle |w| = 2. The main term is obtained from the constant term
G(0, 0) in the Taylor expansion of D(s, sw) and, therefore, equals


Z
(log R)u+v+d u + v
(w + 1)u+v
(log R)u+v+d 1
,
dw = G(0, 0)
G(0, 0)
(u + v + d)! 2πi
wv+1
(u + v + d)!
v
Γ3

by the binomial expansion.
To complete the analysis of I0 , we only need to show that the partial derivatives of
D(s1 , s2 ) at (0, 0) satisfy the bounds given in the lemma. For this, we use Cauchy’s
estimate for derivatives which takes the form, for z = σ + it and z0 = σ0 + it0 ,
|f (j) (z0 )| ≤

(8.14)

max |f (z)|

|z−z0 |=η

j!
,
ηj

if f (z) is analytic for |z − z0 | ≤ η. In the application below we will choose
(8.15)

η=

1
,
C log U log T

where T = |s1 | + |s2 | + 3.

We see that if z0 is on L or to the right of L then the whole circle |z − z0 − 1| = η
will remain in the region (5.3) and the estimates (5.4) hold in this circle. (We
remind the reader that the generic constants c, C take different values at different
appearances.) Thus, we have for s1 , s2 on L or to the right of L,
∂m ∂j
j D(s1 , s2 )
∂sm
2 ∂s1

(8.16)

≤ j!m!(C log U )j+m (log T1 )j (log T2 )m
≪M

max

∗
|s∗
1 −s1 |≤η,|s2 −s2 |≤η

|D(s∗1 , s∗2 )|

exp(CM U δ1 +δ2 log log U )(log T1 )M (log T2 )M
max(1, |s1 + s2 |)d ,
max(c, |s1 |)a max(c, |s2 |)b

which, if max(|s1 |, |s2 |) ≤ C, reduces to
(8.17)


∂m ∂j
D(s1 , s2 ) ≪M exp CM U δ1 +δ2 log log U .
m
j
∂s2 ∂s1

In particular, we have
(8.18)

∂m ∂j
j D(s1 , s2 )
∂sm
2 ∂s1

s1 =0
s2 =0

≪M (log U )C(M) .

We conclude from (8.10), (8.13), and (8.18) that I0 provides the main term and
some of the secondary terms in Lemma 3 which satisfy the stated bound.

26

D. A. GOLDSTON, J. PINTZ, AND C. Y. YILDIRIM

We now consider I1 . By (8.9), (8.16), and Lemma 1, we have

u

I1 ≪M (log R)
(8.19)
≪M

Z

Z
L

eCMU

δ2

max(1, |s2 |)d (log(|t2 | + 3))M eCMU
|s2 |v+1+b+d

δ2

log log U

R−δ2

|ds2 |

log log U

√
(log(|t2 | + 3))M R−δ2
|ds2 | ≪M e−c log R .
v+1+b
|s2 |

L

The same bound holds for I2 since it is with relabeling equal to I1 . Further, I4 also
satisfies this bound by (7.13) (with relabeling) and Lemma 1.
Finally, we examine I3 , which only occurs if d ≥ 1. Because
(8.20)
Res

s2 =−s1



D(s1 , s2 )Rs1 +s2
su+1
sv+1
(s1 + s2 )d
1
2



=

lim

s2 →−s1

1
∂ d−1
(d − 1)! ∂s2 d−1
d−1

=



D(s1 , s2 )Rs1 +s2
s1 u+1 s2 v+1



X
1
Bi (s1 )(log R)d−1−i ,
(d − 1)! i=0

where

 i  
d − 1 X i ∂ i−j
(8.21) Bi (s1 ) =
D(s1 , s2 )
i
j ∂si−j
2
j=0

(−1)j (v + 1) · · · (v + j)
(−1)j+v+1 su+v+j+2
1

s2 =−s1

,

we have
d−1

(8.22)

I3 =

X
1
Ci (log R)d−1−i ,
(d − 1)! i=0

where
1
Ci =
2πi

(8.23)

Z

Bi (s1 ) ds1 , 0 ≤ i ≤ d − 1.

L

It remains to estimate the Ci ’s, which are independent of R but depend on h.
By (8.21) we see that the functions Bi (s1 ) tend to zero as |t1 | → ∞, −c ≤ σ ≤ 1,
and further by (8.16)
(8.24)

Bi ≪M

i
X
j=0

(log T1 )M exp CM U 2δ1 log log U


1
.
|t1 |u+v+j+2+a+b

Therefore, we may shift the contour L back to the imaginary axis with a semicircle
of radius 1/ log U centered and to the left of s1 = 0. The contribution to Ci from
the integral along the imaginary axis is
(8.25)

≪M (log U )u+v+i+a+b+1 exp(CM log log U ) ≪M (log U )C

′

(M)

.

This expression also bounds the contribution to Ci from the semicircle contour and
thus completes the evaluation of I3 . Combining our results, we obtain Lemma 3.

PRIMES IN TUPLES I

27

9. Proof of Proposition 2
We introduce some standard notation associated with (1.2) and (1.3). Let
X
x
+ E(x; q, a),
log p = [(a, q) = 1]
(9.1)
θ(x; q, a) :=
φ(q)
p≤x
p≡a(mod q)

where [S] is 1 if the statement S is true and is 0 if S is false. Next, we define
E ′ (x, q) := max
|E(x; q, a)|,
a

(9.2)

E ∗ (x, q) = max E ′ (y, q).

(a,q)=1

y≤x

In this paper we only need level of distribution results for E ′ , but usually these
results are stated in the stronger form for E ∗ . Thus, for some 1/2 ≤ ϑ ≤ 1, we
assume, given any A > 0 and ε > 0, that
X
x
.
E ∗ (x, q) ≪A,ε
(9.3)
(log x)A
ϑ−ε
q≤x

This is known to hold with ϑ = 1/2.
We prove the following stronger version of Proposition 2. Let
(9.4)

if h0 ∈
6 H,

 1 (ℓ +ℓ +1) log R
1
2
if h0 ∈ H1 and h0 6∈ H2 ,
CR (ℓ1 , ℓ2 , H1 , H2 , h0 ) =
(ℓ1 +1)(r+ℓ1 +ℓ2 +1)

 (ℓ1 +ℓ2 +2)(ℓ1 +ℓ2 +1) log R if h ∈ H ∩ H .
0
1
2
(ℓ1 +1)(ℓ2 +1)(r+ℓ1 +ℓ2 +1)

By relabeling the variables we obtain the corresponding form if h0 ∈ H2 and h0 6∈
H1 . We continue to use the notation (7.1).
Proposition 2′ . Suppose h ≪ R. Given any positive A, there is a B = B(A, M )
1
such that for R ≪M,A N 4 /(log N )B and R, N → ∞,
N
X

ΛR (n; H1 , ℓ1 )ΛR (n; H2 , ℓ2 )θ(n + h0 )

n=1

(9.5)



CR (ℓ1 , ℓ2 , H1 , H2 , h0 ) ℓ1 + ℓ2
S(H0 )N (log R)r+ℓ1 +ℓ2
=
ℓ1
(r + ℓ1 + ℓ2 )!

r
X
Dj (ℓ1 , ℓ2 , H1 , H2 , h0 )(log R)r+ℓ1 +ℓ2 −j + OM,A
+N
j=1

N
(log N )A



,

where the Dj (ℓ1 , ℓ2 , H1 , H2 , h0 )’s are functions independent of R and N which satisfy the bound
(9.6)

′

Dj (H1 , H2 , h0 ) ≪M (log U )Cj ≪M (log log 10h)Cj

for some positive constants Cj , Cj′ depending on M . Assuming that conjecture (9.3)
ϑ
holds, then (9.5) holds for R ≪M N 2 −ε and h ≤ Rε , for any given ε > 0.
Proof. We will assume that both H1 and H2 are non-empty so that k1 ≥ 1 and
k2 ≥ 1. The proof in the case when one of these sets is empty is much easier and

28

D. A. GOLDSTON, J. PINTZ, AND C. Y. YILDIRIM

may be obtained by an argument analogous to that of Section 6. We have
(9.7)
N
X
SeR (N ; H1 , H2 , ℓ1 , ℓ2 , h0 ) :=
ΛR (n; H1 , ℓ1 )ΛR (n; H2 , ℓ2 )θ(n + h0 )
n=1

1
=
(k1 + ℓ1 )!(k2 + ℓ2 )!

X

d,e≤R



R
µ(d)µ(e) log
d

k1 +ℓ1 
k +ℓ
R 2 2
log
e

X

θ(n + h0 ).

1≤n≤N
d|PH1 (n)
e|PH2 (n)

To treat the inner sum above, let d = a1 a12 and e = a2 a12 , where (d, e) = a12 ,
so that a1 , a2 , and a12 are pairwise relatively prime. As in Section 7, the
n for which d|PH1 (n) and e|PH2 (n) cover certain residue classes modulo [d, e].
If n ≡ b (mod a1 a2 a12 ) is such a residue class, then letting m = n + h0 ≡
b + h0 (mod a1 a2 a12 ), we see that this residue class contributes to the inner sum
(9.8)

X

1+h0 ≤m≤N +h0
m≡b+h0 (mod a1 a2 a12 )

θ(m) = θ(N + h0 ; a1 a2 a12 , b + h0 ) − θ(h0 ; a1 a2 a12 , b + h0 )

= [(b + h0 , a1 a2 a12 ) = 1]

N
+ E(N ; a1 a2 a12 , b + h0 ) + O(h log N ).
φ(a1 a2 a12 )

We need to determine the number of these residue classes where (b+h0 , a1 a2 a12 ) = 1
so that the main term is non-zero. If p|a1 , then b ≡ −hj (mod p) for some hj ∈ H1 ,
and therefore b + h0 ≡ h0 − hj (mod p). Thus, if h0 is distinct modulo p from all
the hj ∈ H1 , then all νp (H1 ) residue classes satisfy the relatively prime condition,
while otherwise h0 ≡ hj (mod p) for some hj ∈ H1 leaving νp (H1 ) − 1 residue classes
with a non-zero main term. We introduce the notation νp∗ (H1 0 ) for this number in
either case, where we define for a set G and integer h0
νp∗ (G) = νp (G 0 ) − 1,

(9.9)
where

G 0 = G ∪ {h0 }.

(9.10)

We extend this definition to νd∗ (H1 0 ) for squarefree numbers d by multiplicativity.
The function νd∗ is familiar in sieve theory, see [15]. A more algebraic discussion of
νd ∗ may also be found in [13, 14]. We define ν ∗d (H1 ∩H2 )0 as in (7.5).
Next, the divisibility conditions a2 |PH2 (n), a12 |PH1 (n), and a12 |PH2 (n) are handled as in Section 7 together with the above considerations. Since E(n; q, a) ≪
(log N ) if (a, q) > 1 and q ≤ N , we conclude that
X

N
θ(n + h0 ) = νa∗1 (H1 0 )νa∗2 (H2 0 )ν ∗a12 (H1 ∩H2 )0
φ(a1 a2 a12 )
(9.11)

1≤n≤N
d|PH1 (n)
e|PH2 (n)





+ O dk (a1 a2 a12 ) 

max

b
(b,a1 a2 a12 )=1

Substituting this into (9.7) we obtain



E(N ; a1 a2 a12 , b) + h(log N ) .

PRIMES IN TUPLES I

(9.12)
SeR (N ; H1 , H2 , ℓ1 , ℓ2 , h0 )
=

N
(k1 + ℓ1 )!(k2 + ℓ2 )!




M
+O
(log R)

X′

a1 a12 ≤R
a2 a12 ≤R

29

X′ µ(a1 )µ(a2 )µ(a12 )2 νa∗ (H1 0 )νa∗ (H2 0 )ν ∗a (H1 ∩H2 )0
12
2
1
φ(a1 a2 a12 )

a1 a12 ≤R
a2 a12 ≤R


× log

R
a1 a12

k1 +ℓ1 


R
log
a2 a12

k2 +ℓ2


2
M+3k+1
dk (a1 a2 a12 )E ′ (N, a1 a2 a12 )
)
 + O(hR (3 log N )


= N TeR (H1 , H2 , ℓ1 , ℓ2 , h0 ) + O (log R)M Ek (N ) + O(hR2 (3 log N )M+3k+1 ),

where the last error term was obtained using Lemma 2. To estimate the first error
term we use Lemma 2, (1.3), and the trivialpestimate E ′ (N, q) ≤ (2N/q) log N for
q ≤ N , and (9.3) to find, uniformly for k ≤ (log N )/18, that
|Ek (N )| ≤
=

X♭

b
(b,q)=1

q≤R2

X♭

X

dk (q) max E(N ; q, b)

1

q=a1 a2 a12

dk (q)d3 (q)E ′ (N, q)

q≤R2

v
s
u X♭
d3k (q)2 X
u
q(E ′ (N, q))2
≤t
q
2
2

(9.13)

q≤R

q≤R

sX
q
p
2
9k
2N log N
E ′ (N, q)
≤ (log N )
q≤R2

≪ N (log N )(9k

2

+1−A)/2

,

1

provided R2 ≪ N 2 /(log N )B . On relabeling, we conclude that given any positive
1

integers A and M there is a positive constant B = B(A, M ) so that for R ≪
and h ≤ R,
(9.14)

SeR (N ; H1 , H2 , ℓ1 , ℓ2 , h0 ) = N TeR (H1 , H2 , ℓ1 , ℓ2 , h0 ) + OM



N4
(log N )B

N
(log N )A



.

Using (9.3) with any ϑ > 1/2, we see that (9.14) holds for the longer range R ≪M
ϑ
N 2 −ε , h ≪ N ε .
Returning to the main term in (9.12), we have by (6.6) that
(9.15) TeR (H1 , H2 , ℓ1 , ℓ2 , h0 ) =

1
(2πi)2

Z Z

(1) (1)

F (s1 , s2 )

Rs1
s1

k1 +ℓ1 +1

Rs2
s2

k2 +ℓ2 +1

ds1 ds2 ,


30

D. A. GOLDSTON, J. PINTZ, AND C. Y. YILDIRIM

where, by letting sj = σj + itj and assuming σ1 , σ2 > 0,
(9.16)
µ(a1 )µ(a2 )µ(a12 )2 νa∗1 (H1 0 )νa∗2 (H2 0 )ν ∗a12 (H1 ∩H2 )0
F (s1 , s2 ) =
φ(a1 )a1 s1 φ(a2 )a2 s2 φ(a12 )a12 s1 +s2
1≤a1 ,a2 ,a12 <∞

Y
ν ∗p (H1 ∩H2 )0
νp∗ (H2 0 )
νp∗ (H1 0 )
.
−
+
=
1−
(p − 1)ps1
(p − 1)ps2
(p − 1)ps1 +s2
p
X′


We now consider three cases.
Case 1. Suppose h0 6∈ H. Then we have, for p > h,
νp∗ (H1 0 ) = k1 ,

νp∗ (H2 0 ) = k2 ,


ν ∗p (H1 ∩H2 )0 = r.

Therefore in this case we define GH1 ,H2 (s1 , s2 ) by
(9.17)

F (s1 , s2 ) = GH1 ,H2 (s1 , s2 )

ζ(1 + s1 + s2 )r
.
ζ(1 + s1 )k1 ζ(1 + s2 )k2

Case 2. Suppose h0 ∈ H1 but h0 ∈
6 H2 . (By relabeling this also covers the case
where h0 ∈ H2 and h0 6∈ H1 .) Then for p > h

νp∗ (H1 0 ) = k1 − 1, νp∗ (H2 0 ) = k2 , ν ∗p (H1 ∩H2 )0 = r.
Therefore, we define GH1 ,H2 (s1 , s2 ) by

(9.18)

F (s1 , s2 ) = GH1 ,H2 (s1 , s2 )

ζ(1 + s1 + s2 )r
.
ζ(1 + s1 )k1 −1 ζ(1 + s2 )k2

Case 3. Suppose h0 ∈ H1 ∩ H2 . Then for p > h
νp∗ (H1 0 ) = k1 − 1,

νp∗ (H2 0 ) = k2 − 1,

Thus, we define GH1 ,H2 (s1 , s2 ) by
(9.19)

F (s1 , s2 ) = GH1 ,H2 (s1 , s2 )


ν ∗p (H1 ∩H2 )0 = r − 1.

ζ(1 + s1 + s2 )r−1
.
ζ(1 + s1 )k1 −1 ζ(1 + s2 )k2 −1

In each case, G is analytic and uniformly bounded for σ1 , σ2 > −c, with any c < 1/4.
We now show that in all three cases
(9.20)

GH1 ,H2 (0, 0) = S(H0 ).

Notice that in the second two cases we have H0 = H. By (5.1), (7.5), (9.9), and
(9.16), we find in all three cases
(9.21)
GH1 ,H2 (0, 0)

−a(H1 ,H2 ,h0 )
Y
1
νp (H1 0 ) + νp (H2 0 ) − ν p ((H1 ∩H2 )0 ) − 1
1−
=
1−
p−1
p
p

−a(H1 ,H2 ,h0 )
Y
νp (H0 ) − 1
1
=
1−
1−
,
p−1
p
p

PRIMES IN TUPLES I

31

where in Case 1 a(H1 , H2 , h0 ) = k1 + k2 − r = k − r, in Case 2 a(H1 , H2 , h0 ) =
(k1 −1)+k2 −r = k−r−1, and in Case 3 a(H1 , H2 , h0 ) = (k1 −1)+(k2 −1)−(r−1) =
k − r − 1. Hence, in Case 1 we have
−(k−r)
Y  p − νp (H0 )  
1
1−
GH1 ,H2 (0, 0) =
p−1
p
p



−(k−r+1)
Y
(9.22)
νp (H0 )
1
=
1−
1−
p
p
p
= S(H0 ),

while in Cases 2 and 3 we have

(9.23)

Y  p − νp (H)  
1−
GH1 ,H2 (0, 0) =
p−1
p

Y
νp (H)
=
1−
1−
p
p

1
p
1
p

−(k−r−1)

−(k−r)

= S(H) (= S(H0 )).

We are now ready to evaluate TR (H1 , H2 , ℓ1 , ℓ2 , h0 ). There are two differences
between the functions F and G that appear in (9.16)–(9.19) and the earlier (7.8)–
(7.10). The first difference is that a factor of p in the denominator of the Euler
product in (7.8) has been replaced by p−1, which only effects the value of constants
in calculations. The second difference is the relationship between k1 , k2 , and r,
which effects the residue calculations of the main terms. However, the analysis of
lower order terms and the error analysis is essentially unchanged and, therefore, we
only need to examine the main terms. We use Lemma 3 here to cover all of the
cases. Taking into account (9.17)–(9.19) we have in Case 1 that a = k1 , b = k2 , d =
r, u = ℓ1 , v = ℓ2 ; in Case 2 that a = k1 − 1, b = k2 , d = r, u = ℓ1 + 1, v = ℓ2 ; and in
Case 3 that a = k1 − 1, b = k2 − 1, d = r − 1, u = ℓ1 + 1, v = ℓ2 + 1. By (9.22) and
(9.23), the proof of Theorems 2 and 2′ is thus complete.
10. Proof of Theorem 3
For convenience, we agree in our notation below that we consider every set of
size k with a multiplicity k! according to all permutations of the elements hi ∈ H,
unless mentioned otherwise. While unconventional, this will clarify some of the
calculations.
To prove Theorem 3 we consider in place of (3.5)
(10.1)
SR (N, k, ℓ, h, ν)
1
:=
N h2k+1

 X
2N
X

n=N +1

fR (N, k, ℓ, h) − ν
=M

1≤h0 ≤h

θ(n + h0 ) − ν log 3N

log 3N
MR (N, k, ℓ, h),
h



X

H⊂{1,2,...,h}
|H|=k

!2

ΛR (n; H, ℓ)

32

D. A. GOLDSTON, J. PINTZ, AND C. Y. YILDIRIM

where
1
MR (N, k, ℓ, h) =
N h2k

(10.2)

and
(10.3)
fR (N, k, ℓ, h) =
M

1
N h2k+1

2N
X

n=N +1

 X
2N
X

n=N +1

1≤h0 ≤h

X

H⊂{1,2,...,h}
|H|=k


θ(n + h0 )

!2

ΛR (n; H, ℓ)

X

H⊂{1,2,...,h}
|H|=k

!2

ΛR (n; H, ℓ)

.

fR we multiply out the sum and apply Propositions 1 and 2.
To evaluate MR and M
We need to group the pairs of sets H1 and H2 according to size of the intersection
r = |H1 ∩ H2 |, and thus |H| = |H1 ∪ H2 | = 2k − r. Let us choose now a set H and
here exceptionally we disregard the permutation of the elements in H. (However for
H1 and H2 we take into account
all permutations.) Given the set H of size 2k − r,

we can choose H1 in 2k−r
ways.
Afterwards, we can choose the intersection set
k

k
in r ways. Finally, we can arrange the elements both in H1 and H2 in k! ways.
This gives

 
 2
2k − r
k
k
(10.4)
(k!)2 = (2k − r)!
r!
k
r
r
choices for H1 and H2 , taking into account the permutation of the elements in H1
and H2 . If we consider in the summation every union set H of size j just once,
independently of the arrangement of the elements, then Gallagher’s theorem (3.7)
may be formulated as
X∗

(10.5)

where
letting

P∗

H⊂{1,2,...,h}
|H|=j

S(H) ∼

hj
,
j!

indicates every set is counted just once. Applying this, we obtain on

(10.6)

x=

log R
,
h

and using Proposition 1
(10.7)
MR (N, k, ℓ, h) ∼

 2  
k
1 X
k
2ℓ (log R)2ℓ+r
N
(2k
−
r)!
r!
2k
N h r=0
(r + 2ℓ)!
r
ℓ

 
k  2
X
k
xr
2ℓ
2ℓ
∼
(log R)
.
r (r + 1) . . . (r + 2ℓ)
ℓ
r=0

|H|=2k−r

By Proposition 2 and (10.5) we have
(10.8)

fR (N, k, ℓ, h) ∼
M

X∗

 2
k
X
k
1
r! Zr ,
(2k − r)!
2k+1
Nh
r
r=0

S(H)

PRIMES IN TUPLES I

where abbreviating a =

2ℓ+1
ℓ+1

=

(10.9)


2ℓ+1
ℓ+1


2ℓ −1
ℓ

=

33


1 2ℓ+2
2 ℓ+1


2ℓ −1
,
ℓ

we have

(
 
X∗
log R
2ℓ (log R)2ℓ+r
2a
r
S(H)N
Zr : =
(r + 2ℓ)!
r + 2ℓ + 1
ℓ
|H|=2k−r

X∗

+ (2k − 2r)

|H|=2k−r

log R
a
S(H)N +
r + 2ℓ + 1

X∗

|H|=2k−r

h
X

h0 =1
h0 ∈H
/

0

S(H )N

)

 2k−r

 
2ak log R
h
2k − r + 1 2k−r+1
2ℓ (log R)2ℓ+r
,
∼N
+
h
(r + 2ℓ)!
(2k − r)! r + 2ℓ + 1 (2k − r + 1)!
ℓ
where in the last sum we took into account which element of H0 is h0 , which can
be chosen in 2k − r + 1 ways. Thus we obtain
(10.10)

 

k  2
X
2ak
k
xr
2ℓ
2ℓ
f
MR (N, k, ℓ, h) ∼
x+1 .
(log R)
r (r + 1) . . . (r + 2ℓ) r + 2ℓ + 1
ℓ
r=0

We conclude, on introducing the parameters
(10.11)
that
(10.12)

ϕ=

1
, (so a = 2 − ϕ),
ℓ+1

Θ=

log R
, (so R = (3N )Θ ),
log 3N

 
2ℓ
SR (N, k, ℓ, h, ν) ∼
(log R)2ℓ Pk,ℓ,ν (x),
ℓ

where
(10.13)

Pk,ℓ,ν (x) =

k  2
X
k
r=0

Let
(10.14)

r

xr
(r + 1) · · · (r + 2ℓ)

h = λ log 3N,




4(1 − ϕ2 )k
ν
1+x
.
−
r + 2ℓ + 1 Θ

so that x =

Θ
.
λ

The analysis of when S > 0 now depends on the polynomial Pk,ℓ,ν (x). We examine
this polynomial as k, ℓ → ∞ in such a way that ℓ = o(k). In the first place, the size
of the terms of the polynomial are determined by the factor
 2
k
g(r) =
xr ,
r
and since g(r) > g(r − 1) is equivalent to
r<

k+1
1 + √1x

we should expect that the polynomial is controlled by terms with r close to k/(z+1),
where
1
(10.15)
z= √ .
x

34

D. A. GOLDSTON, J. PINTZ, AND C. Y. YILDIRIM

Consider now the sign of each term. For small x, the terms in the polynomial are
positive, but they become negative when


4(1 − ϕ2 )k
ν
1+x
< 0.
−
r + 2ℓ + 1 Θ
When r = k/(z + 1) and letting k, ℓ → ∞, ℓ = o(k), we have heuristically

!
ν
1+x
−
k
Θ
z+1
ν
1
.
= 2 (z + 2)2 −
z
Θ
pν
Therefore, the terms will be positive for r in this range if z < Θ
− 2, which is
√ 2
√
equivalent to λ < ( ν − 2 Θ) . Since we can take Θ as close to ϑ/2 as we wish,
this will imply Theorem 3. To make this argument precise, we choose r0 slightly
smaller than where g(r) is maximal, and prove that all the negative terms together
contribute less then the single term r0 , which will be positive for z and thus λ close
to the values above.
For the proof, we may assume ν ≥ 2 and 1/2 ≤ ϑ0 ≤ 1 are fixed, with ϑ0 < 1
in case of ν = 2. (The case ν = 1 is covered by Theorem 2, and the case ν = 2,
ϑ0 = 1, E2 = 0 is covered by (1.11) proved in Section 3.) First, we choose ε0
as a sufficiently small fixed positive number. We will choose ℓ sufficiently large,
depending on ν, ϑ0 , ε0 , and set


(10.16)

4(1 − ϕ2 )k
ν
−
r + 2ℓ + 1 Θ

k = (ℓ + 1)2 = ϕ−2 ,



1
≈1+ 2
z

4k

ℓ > ℓ0 (ν, ϑ0 , ε0 ), so that ϕ < ϕ0 (ν, ϑ0 , ε0 ).

Furthermore, we choose
(10.17)

Θ=

ϑ0 (1 − ϕ)
log R
=
,
log 3N
2

and (because of our assumptions on ν) we can define
(10.18)

z0 :=

Thus, we see that
(10.19)

4k

1
1+ 2
z0

k
z0 +1

p
2ν/ϑ0 − 2 > 0.

2ν
−
ϑ0

!

=

1
2ν 
(z0 + 2)2 −
= 0,
z0 2
ϑ0

Let us choose now
(10.20)

r0 =




k+1
,
z0 + 1

r1 = r0 + ϕk = r0 + ℓ + 1,

and put
(10.21)

z = z0 (1 + ε0 ).

PRIMES IN TUPLES I

35

The linear factor in each term of Pk,ℓ,ν (x) is, for r0 ≤ r ≤ r1 ,

(10.22)

1+x



4(1 − ϕ2 )k
ν
−
r + 2ℓ + 1 Θ



1
4k(1 + O(ϕ))
2ν
=1+ 2
−
z0 (1 + ε0 )2 z k+1 + O(kϕ) ϑ0 (1 − ϕ)
0
√
−z02 + O( νϕ) + O(νϕ)
=1+
z02 (1 + ε0 )2
> c(ν, ϑ0 )ε0 if ϕ < ϕ0 (ν, ϑ, ε0 ),

!

where c(ν, ϑ0 ) > 0 is a constant. Letting
 2
k
xr
(10.23)
f (r) :=
,
r (r + 1) . . . (r + 2ℓ)

we have, for any r2 > r1 ,
Y  k + 1 − r 1 2
Y  k + 1 − r 1 2
f (r2 )
(10.24)
<
·
<
·
f (r0 )
r
z
r
z
r0 <r≤r2
r0 <r≤r1

2ℓ
 2ℓ 
k+1
z0 + 1 − 1
1
<
≤
< e−ε0 ℓ .
−1
r0 + 1
z
z0 (1 + ε0 )

Thus, the total contribution in absolute value of the negative terms of Pk,ℓ,ν (x) will
be, for sufficiently large ℓ, at most


4(k + ν)
e−ε0 ℓ f (r0 ) < e−ε0 ℓ/2 f (r0 ),
(10.25)
k 1+
z2
while that of the single term r0 will be by (10.22) at least

(10.26)

c(ν, ϑ0 )ε0 f (r0 ) > e−ε0 ℓ/2 f (r0 ) if ℓ > ℓ0 (ν, ϑ0 , ε0 ).

This shows that Pk,ℓ,ν (x) > 0. Hence, we must have at least ν + 1 primes in some
interval
(10.27)

[n + 1, n + h] = [n + 1, n + λ log 3N ],

where

n ∈ [N + 1, 2N ],

√
2
p
ϑ0 2
z0 (1 + ε0 )2 = (1 + ε0 )2
ν − 2ϑ0 .
2
Since ε0 can be chosen arbitrarily small, this proves Theorem 3.

(10.28)

λ = Θz 2 <

References
[1] Paul Bateman and Roger A. Horn, A heuristic formula concerning the distribution of prime
numbers, Math. Comp. 16 (1962) , 363–367.
[2] E. Bombieri and H. Davenport, Small differences between prime numbers, Proc. Roy. Soc.
Ser. A 293 (1966), 1–18.
[3] E. Bombieri, J. B. Friedlander, and H. Iwaniec, Primes in arithmetic progressions to large
moduli. III. J. Am. Math. Soc. 2, No.2, 215-223 (1989).
[4] H. Davenport, Multiplicative Number Theory, Second Edition, Revised by Hugh L. Montgomery, Springer, Berlin, Heidelberg, New York, 1980.
[5] P.D.T.A Elliott and H. Halberstam, A conjecture in prime number theory, Symposia Mathematica 4 (INDAM, Rome, 1968/69) 59–72, Academic Press, London.
[6] Thomas
J.
Engelsma,
k-tuple
permissible
patterns,
February
2005,
http://www.opertech.com/primes/k-tuples.html
[7] P. Erdős, The difference of consecutive primes, Duke Math. J. 6 (1940), 438–441.

36

D. A. GOLDSTON, J. PINTZ, AND C. Y. YILDIRIM

[8] Kevin Ford, Zero-free region for the Riemann zeta function, Number Theory for the Millennium II, (Urbana, IL., 2000), 25–56, A. K. Peters, Natick, MA 2002.
[9] P. X. Gallagher, On the distribution of primes in short intervals, Mathematika 23 (1976),
4–9.
[10] D. A. Goldston, On Bombieri and Davenport’s theorem concerning small gaps between
primes, Mathematika 39 (1992), 10–17.
[11] D. A. Goldston and C. Y. Yıldırım, Higher correlations of divisor sums related to primes.
I: Triple correlations, Integers 3 (2003), A5, 66 pp. (electronic).
[12] D. A. Goldston, C. Y. Yıldırım, Higher correlations of divisor sums III: Small gaps between
primes, preprint.
[13] D. A. Goldston, S. W. Graham, J. Pintz, C. Y. Yıldırım, Small gaps between primes and
almost primes, preprint.
[14] D. A. Goldston, Y. Motohashi, J. Pintz, C. Y. Yıldırım, Small gaps between primes exist,
preprint.
[15] H. Halberstam and H. -E. Richert, Sieve methods, Academic Press, London, New York, 1975.
[16] G. H. Hardy and J. E. Littlewood, Some problems of ‘Partitio Numerorum’: III On the
expression of a number as a sum of primes, Acta Math. 44 (1923), 1–70.
[17] G. H. Hardy and J. E. Littlewood, unpublished manuscript, see [25].
[18] D. R. Heath-Brown, Almost-prime k-tuples, Mathematika 44 (1997), 245–266.
[19] M. N. Huxley, On the differences of primes in arithmetical progressions, Acta Arith. 15
(1968/69), 367–392.
[20] M. N. Huxley, Small differences between consecutive primes II. Mathematika 24 (1977), 142–
152.
[21] M. N. Huxley, An application of the Fouvry-Iwaniec theorem, Acta Arith. 43 (1984), 441–443.
[22] H. Maier, Small differences between prime numbers, Michigan Math. J. 35 (1988), 323–344.
[23] H. L. Montgomery, Topics in Multiplicative Number Theory, Lecture Notes in Mathematics,
Springer, Berlin, Heidelberg, New York, 1971.
[24] G. Z. Pilt’ai, On the size of the difference between consecutive primes, Issledovania po teorii
chisel, 4 (1972), 73–79.
[25] R. A. Rankin, The difference between consecutive prime numbers. II, Proc. Cambridge Philos.
Soc. 36 (1940), 255–266.
[26] G. Ricci, Sull’andamento della differenza di numeri primi consecutivi, Riv. Mat. Univ. Parma
5 (1954), 3–54.
[27] A. Schinzel and W. Sierpinski, Sur certaines hypotheses concernant les nombres premiers,
Acta Arith. 4 (1958), 185–208, erratum 5 (1958), 259.
[28] A. Selberg, Collected Papers, Vol. II, Springer, Berlin, 1991.
[29] K. Soundararajan, Notes on Goldston-Pintz-Yildirim, unpublished.
[30] E. C. Titchmarsh, The theory of the Riemann zeta-function, Second edition, Edited and with
a preface by D. R. Heath-Brown, The Clarendon Press, Oxford University Press, New York,
1986.
[31] S. Uchiyama, On the difference between consecutive prime numbers, Acta Arith. 27 (1975),
153–157
Department of Mathematics, San Jose State University, San Jose, CA 95192, USA
E-mail address: goldston@math.sjsu.edu
Rényi Mathematical Institute of the Hungarian Academy of Sciences, H-1364 Budapest, P.O.B. 127, Hungary
E-mail address: pintz@renyi.hu
Department of Mathematics, Bog̃aziçi University, Istanbul 34342 & Feza Gürsey Enstitüsü, Çengelköy, Istanbul, P.K. 6, 81220, Turkey
E-mail address: yalciny@boun.edu.tr