All of lore.kernel.org
 help / color / mirror / Atom feed
* fixing GDDR5 reclocking on kepler cards
@ 2015-10-12 20:18 Karol Herbst
       [not found] ` <1444681109-380-1-git-send-email-nouveau-lIBOoy2+GI7scQ4cX5LuPg@public.gmane.org>
  0 siblings, 1 reply; 4+ messages in thread
From: Karol Herbst @ 2015-10-12 20:18 UTC (permalink / raw)
  To: nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

this is my first patch on the list through git send-mail and I hope everything
is set up right, sorry for the noise here, but I don't want to try with an
empty mail :)

as the subject already says, this patch fixes one of the more serious issues
while reclocking gddr5 on kepler cards. It works for me and for a bunch of others
I met on IRC.

Karol Herbst (1):
  pll/gk104: fix PLL instability due to bad configuration with gddr5

 drm/nouveau/nvkm/subdev/clk/Kbuild     |  1 +
 drm/nouveau/nvkm/subdev/clk/pll.h      |  5 +++
 drm/nouveau/nvkm/subdev/clk/pllgk104.c | 77 ++++++++++++++++++++++++++++++++++
 drm/nouveau/nvkm/subdev/clk/pllgt215.c |  6 +++
 drm/nouveau/nvkm/subdev/fb/ramgk104.c  | 39 +++++++----------
 5 files changed, 105 insertions(+), 23 deletions(-)
 create mode 100644 drm/nouveau/nvkm/subdev/clk/pllgk104.c

-- 
2.6.1

_______________________________________________
Nouveau mailing list
Nouveau@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/nouveau

^ permalink raw reply	[flat|nested] 4+ messages in thread

* [PATCH] pll/gk104: fix PLL instability due to bad configuration with gddr5
       [not found] ` <1444681109-380-1-git-send-email-nouveau-lIBOoy2+GI7scQ4cX5LuPg@public.gmane.org>
@ 2015-10-12 20:18   ` Karol Herbst
       [not found]     ` <1444681109-380-2-git-send-email-nouveau-lIBOoy2+GI7scQ4cX5LuPg@public.gmane.org>
  0 siblings, 1 reply; 4+ messages in thread
From: Karol Herbst @ 2015-10-12 20:18 UTC (permalink / raw)
  To: nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

this patch uses an approach closer to the nvidia driver to configure both PLLs
for high gddr5 memory clocks (usually above 2400MHz)

previously nouveau used the one PLL as it was used for the lower clocks and just
adjusted the second PLL to get as close as possible to the requested clock.
This means for my card, that I got a 4050 MHz clock allthough 4008 MHz was
requested.

now the driver iterates over a list of PLL configuration also used by the
nvidia driver and then adjust the second PLL to get near the requested clock.
Also it hold to some restriction I found while analyzing the PLL configurations

This won't fix all gddr5 high clock issues itself, but it should be fine on
hybrid gpu systems as found on many laptops these days. Also switching while
normal desktop usage should be a lot more stable than before.

Signed-off-by: Karol Herbst <nouveau@karolherbst.de>
---
 drm/nouveau/nvkm/subdev/clk/Kbuild     |  1 +
 drm/nouveau/nvkm/subdev/clk/pll.h      |  5 +++
 drm/nouveau/nvkm/subdev/clk/pllgk104.c | 77 ++++++++++++++++++++++++++++++++++
 drm/nouveau/nvkm/subdev/clk/pllgt215.c |  6 +++
 drm/nouveau/nvkm/subdev/fb/ramgk104.c  | 39 +++++++----------
 5 files changed, 105 insertions(+), 23 deletions(-)
 create mode 100644 drm/nouveau/nvkm/subdev/clk/pllgk104.c

diff --git a/drm/nouveau/nvkm/subdev/clk/Kbuild b/drm/nouveau/nvkm/subdev/clk/Kbuild
index ed7717b..5ac23fe 100644
--- a/drm/nouveau/nvkm/subdev/clk/Kbuild
+++ b/drm/nouveau/nvkm/subdev/clk/Kbuild
@@ -11,3 +11,4 @@ nvkm-y += nvkm/subdev/clk/gk20a.o
 
 nvkm-y += nvkm/subdev/clk/pllnv04.o
 nvkm-y += nvkm/subdev/clk/pllgt215.o
+nvkm-y += nvkm/subdev/clk/pllgk104.o
diff --git a/drm/nouveau/nvkm/subdev/clk/pll.h b/drm/nouveau/nvkm/subdev/clk/pll.h
index 44020a3..1f5e0a0 100644
--- a/drm/nouveau/nvkm/subdev/clk/pll.h
+++ b/drm/nouveau/nvkm/subdev/clk/pll.h
@@ -4,8 +4,13 @@
 struct nvkm_subdev;
 struct nvbios_pll;
 
+int calc_pll_clock(int fN, int M, int N, int P, int clk);
+
 int nv04_pll_calc(struct nvkm_subdev *, struct nvbios_pll *, u32 freq,
 		  int *N1, int *M1, int *N2, int *M2, int *P);
 int gt215_pll_calc(struct nvkm_subdev *, struct nvbios_pll *, u32 freq,
 		  int *N, int *fN, int *M, int *P);
+int gk104_pll_calc_hiclk(int target_khz, int crystal,
+		int *N1, int *fN1, int *M1, int *P1,
+		int *N2, int *M2, int *P2);
 #endif
diff --git a/drm/nouveau/nvkm/subdev/clk/pllgk104.c b/drm/nouveau/nvkm/subdev/clk/pllgk104.c
new file mode 100644
index 0000000..4863e64
--- /dev/null
+++ b/drm/nouveau/nvkm/subdev/clk/pllgk104.c
@@ -0,0 +1,77 @@
+/*
+ * Copyright 2015 Karol Herbst
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ *
+ * Authors: Karol Herbst
+ */
+#include "pll.h"
+
+int
+gk104_pll_calc_hiclk(int target_khz, int crystal,
+		int *N1, int *fN1, int *M1, int *P1,
+		int *N2, int *M2, int *P2)
+{
+	int best_clk = 0, best_err = target_khz, p_ref, n_ref;
+	bool upper = false;
+
+	*M1 = 1;
+	/* M has to be 1, otherwise it gets unstable */
+	*M2 = 1;
+	/* can be 1 or 2, sticking with 1 for simplicity */
+	*P2 = 1;
+
+	for (p_ref = 0x7; p_ref >= 0x5; --p_ref) {
+		for (n_ref = 0x25; n_ref <= 0x2b; ++n_ref) {
+			int cur_N, cur_clk, cur_err;
+
+			cur_clk = calc_pll_clock(0, 1, n_ref, p_ref, crystal);
+			cur_N = target_khz / cur_clk;
+			cur_err = target_khz - calc_pll_clock(0xf000, 1, cur_N, 1, cur_clk);
+
+			/* we found a better combination */
+			if (cur_err < best_err) {
+				best_err = cur_err;
+				best_clk = cur_clk;
+				*N2 = cur_N;
+				*N1 = n_ref;
+				*P1 = p_ref;
+				upper = false;
+			}
+
+			cur_N += 1;
+			cur_err = calc_pll_clock(0xf000, 1, cur_N, 1, cur_clk) - target_khz;
+			if (cur_err < best_err) {
+				best_err = cur_err;
+				best_clk = cur_clk;
+				*N2 = cur_N;
+				*N1 = n_ref;
+				*P1 = p_ref;
+				upper = true;
+			}
+		}
+	}
+
+	/* adjust fN to get closer to the target clock */
+	*fN1 = (u16)((((best_err / *N2 * *P2) * (*P1 * *M1)) << 13) / crystal);
+	if (upper)
+		*fN1 = (u16)(1 - *fN1);
+
+	return calc_pll_clock(*fN1, 1, *N1, *P1, crystal);
+}
diff --git a/drm/nouveau/nvkm/subdev/clk/pllgt215.c b/drm/nouveau/nvkm/subdev/clk/pllgt215.c
index c6fccd6..fe9886c 100644
--- a/drm/nouveau/nvkm/subdev/clk/pllgt215.c
+++ b/drm/nouveau/nvkm/subdev/clk/pllgt215.c
@@ -27,6 +27,12 @@
 #include <subdev/bios/pll.h>
 
 int
+calc_pll_clock(int fN, int M, int N, int P, int clk)
+{
+	return ((clk * N) + (((u16)(fN + 4096) * clk) >> 13)) / (M * P);
+}
+
+int
 gt215_pll_calc(struct nvkm_subdev *subdev, struct nvbios_pll *info,
 	       u32 freq, int *pN, int *pfN, int *pM, int *P)
 {
diff --git a/drm/nouveau/nvkm/subdev/fb/ramgk104.c b/drm/nouveau/nvkm/subdev/fb/ramgk104.c
index 9893556..c4373bf 100644
--- a/drm/nouveau/nvkm/subdev/fb/ramgk104.c
+++ b/drm/nouveau/nvkm/subdev/fb/ramgk104.c
@@ -968,31 +968,24 @@ gk104_ram_calc_xits(struct gk104_ram *ram, struct nvkm_ram_data *next)
 	 * kepler boards, no idea how/why they're chosen.
 	 */
 	refclk = next->freq;
-	if (ram->mode == 2)
-		refclk = fuc->mempll.refclk;
-
-	/* calculate refpll coefficients */
-	ret = gt215_pll_calc(subdev, &fuc->refpll, refclk, &ram->N1,
-			     &ram->fN1, &ram->M1, &ram->P1);
-	fuc->mempll.refclk = ret;
-	if (ret <= 0) {
-		nvkm_error(subdev, "unable to calc refpll\n");
-		return -EINVAL;
-	}
-
-	/* calculate mempll coefficients, if we're using it */
 	if (ram->mode == 2) {
-		/* post-divider doesn't work... the reg takes the values but
-		 * appears to completely ignore it.  there *is* a bit at
-		 * bit 28 that appears to divide the clock by 2 if set.
-		 */
-		fuc->mempll.min_p = 1;
-		fuc->mempll.max_p = 2;
-
-		ret = gt215_pll_calc(subdev, &fuc->mempll, next->freq,
-				     &ram->N2, NULL, &ram->M2, &ram->P2);
+		ret = gk104_pll_calc_hiclk(next->freq, subdev->device->crystal,
+				&ram->N1, &ram->fN1, &ram->M1, &ram->P1,
+				&ram->N2, &ram->M2, &ram->P2);
+		fuc->mempll.refclk = ret;
+		if (ret <= 0) {
+			nvkm_error(subdev, "unable to calc plls\n");
+			return -EINVAL;
+		}
+		nvkm_debug(subdev, "sucessfully calced PLLs for clock %i kHz"
+				" (refclock: %i kHz)\n", next->freq, ret);
+	} else {
+		/* calculate refpll coefficients */
+		ret = gt215_pll_calc(subdev, &fuc->refpll, refclk, &ram->N1,
+				     &ram->fN1, &ram->M1, &ram->P1);
+		fuc->mempll.refclk = ret;
 		if (ret <= 0) {
-			nvkm_error(subdev, "unable to calc mempll\n");
+			nvkm_error(subdev, "unable to calc refpll\n");
 			return -EINVAL;
 		}
 	}
-- 
2.6.1

_______________________________________________
Nouveau mailing list
Nouveau@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/nouveau

^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH] pll/gk104: fix PLL instability due to bad configuration with gddr5
       [not found]     ` <1444681109-380-2-git-send-email-nouveau-lIBOoy2+GI7scQ4cX5LuPg@public.gmane.org>
@ 2015-10-13 22:53       ` Ben Skeggs
  0 siblings, 0 replies; 4+ messages in thread
From: Ben Skeggs @ 2015-10-13 22:53 UTC (permalink / raw)
  To: Karol Herbst, nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

On 10/13/2015 06:18 AM, Karol Herbst wrote:
> this patch uses an approach closer to the nvidia driver to 
> configure both PLLs for high gddr5 memory clocks (usually above 
> 2400MHz)
> 
> previously nouveau used the one PLL as it was used for the lower 
> clocks and just adjusted the second PLL to get as close as possible
> to the requested clock. This means for my card, that I got a 4050
> MHz clock allthough 4008 MHz was requested.
> 
> now the driver iterates over a list of PLL configuration also used
>  by the nvidia driver and then adjust the second PLL to get near 
> the requested clock. Also it hold to some restriction I found while
> analyzing the PLL configurations
> 
> This won't fix all gddr5 high clock issues itself, but it should be
> fine on hybrid gpu systems as found on many laptops these days. 
> Also switching while normal desktop usage should be a lot more 
> stable than before.
Given that we really have no better ideas right now, I think the patch
is basically fine.

I think though, that given these PLLs are limited to FB, that you may
as well just stick the code in subdev/fb/ramgk104.c and be done with
it, there's no need for it to live in subdev/clk at all.

Ben.

> 
> Signed-off-by: Karol Herbst <nouveau@karolherbst.de> --- 
> drm/nouveau/nvkm/subdev/clk/Kbuild     |  1 + 
> drm/nouveau/nvkm/subdev/clk/pll.h      |  5 +++ 
> drm/nouveau/nvkm/subdev/clk/pllgk104.c | 77 
> ++++++++++++++++++++++++++++++++++ 
> drm/nouveau/nvkm/subdev/clk/pllgt215.c |  6 +++ 
> drm/nouveau/nvkm/subdev/fb/ramgk104.c  | 39 +++++++---------- 5 
> files changed, 105 insertions(+), 23 deletions(-) create mode 
> 100644 drm/nouveau/nvkm/subdev/clk/pllgk104.c
> 
> diff --git a/drm/nouveau/nvkm/subdev/clk/Kbuild 
> b/drm/nouveau/nvkm/subdev/clk/Kbuild index ed7717b..5ac23fe 100644 
> --- a/drm/nouveau/nvkm/subdev/clk/Kbuild +++ 
> b/drm/nouveau/nvkm/subdev/clk/Kbuild @@ -11,3 +11,4 @@ nvkm-y += 
> nvkm/subdev/clk/gk20a.o
> 
> nvkm-y += nvkm/subdev/clk/pllnv04.o nvkm-y += 
> nvkm/subdev/clk/pllgt215.o +nvkm-y += nvkm/subdev/clk/pllgk104.o 
> diff --git a/drm/nouveau/nvkm/subdev/clk/pll.h 
> b/drm/nouveau/nvkm/subdev/clk/pll.h index 44020a3..1f5e0a0 100644 
> --- a/drm/nouveau/nvkm/subdev/clk/pll.h +++ 
> b/drm/nouveau/nvkm/subdev/clk/pll.h @@ -4,8 +4,13 @@ struct 
> nvkm_subdev; struct nvbios_pll;
> 
> +int calc_pll_clock(int fN, int M, int N, int P, int clk); + int 
> nv04_pll_calc(struct nvkm_subdev *, struct nvbios_pll *, u32 freq, 
> int *N1, int *M1, int *N2, int *M2, int *P); int 
> gt215_pll_calc(struct nvkm_subdev *, struct nvbios_pll *, u32
> freq, int *N, int *fN, int *M, int *P); +int
> gk104_pll_calc_hiclk(int target_khz, int crystal, +		int *N1, int
> *fN1, int *M1, int *P1, + int *N2, int *M2, int *P2); #endif diff
> --git a/drm/nouveau/nvkm/subdev/clk/pllgk104.c 
> b/drm/nouveau/nvkm/subdev/clk/pllgk104.c new file mode 100644 index
> 0000000..4863e64 --- /dev/null +++ 
> b/drm/nouveau/nvkm/subdev/clk/pllgk104.c @@ -0,0 +1,77 @@ +/* + * 
> Copyright 2015 Karol Herbst + * + * Permission is hereby granted, 
> free of charge, to any person obtaining a + * copy of this software
> and associated documentation files (the "Software"), + * to deal in
> the Software without restriction, including without limitation + *
> the rights to use, copy, modify, merge, publish, distribute,
> sublicense, + * and/or sell copies of the Software, and to permit
> persons to whom the + * Software is furnished to do so, subject to
> the following conditions: + * + * The above copyright notice and
> this permission notice shall be included in + * all copies or
> substantial portions of the Software. + * + * THE SOFTWARE IS
> PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + *
> IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF 
> MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND 
> NONINFRINGEMENT.  IN NO EVENT SHALL + * THE COPYRIGHT HOLDER(S) OR
>  AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR + * OTHER LIABILITY,
>  WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, + * ARISING 
> FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR + * 
> OTHER DEALINGS IN THE SOFTWARE. + * + * Authors: Karol Herbst + */ 
> +#include "pll.h" + +int +gk104_pll_calc_hiclk(int target_khz, int
> crystal, +		int *N1, int *fN1, int *M1, int *P1, +		int *N2, int
> *M2, int *P2) +{ +	int best_clk = 0, best_err = target_khz, p_ref,
> n_ref; +	bool upper = false; + +	*M1 = 1; +	/* M has to be 1,
> otherwise it gets unstable */ +	*M2 = 1; +	/* can be 1 or 2, 
> sticking with 1 for simplicity */ +	*P2 = 1; + +	for (p_ref = 0x7;
>  p_ref >= 0x5; --p_ref) { +		for (n_ref = 0x25; n_ref <= 0x2b; 
> ++n_ref) { +			int cur_N, cur_clk, cur_err; + +			cur_clk = 
> calc_pll_clock(0, 1, n_ref, p_ref, crystal); +			cur_N = target_khz
> / cur_clk; +			cur_err = target_khz - calc_pll_clock(0xf000, 1,
> cur_N, 1, cur_clk); + +			/* we found a better combination */ + if
> (cur_err < best_err) { +				best_err = cur_err; +				best_clk =
> cur_clk; +				*N2 = cur_N; +				*N1 = n_ref; +				*P1 = p_ref; +
> upper = false; +			} + +			cur_N += 1; + cur_err =
> calc_pll_clock(0xf000, 1, cur_N, 1, cur_clk) - target_khz; +			if
> (cur_err < best_err) { +				best_err = cur_err; +				best_clk =
> cur_clk; +				*N2 = cur_N; +				*N1 = n_ref; + *P1 = p_ref; + upper
> = true; +			} +		} +	} + +	/* adjust fN to get closer to the target
> clock */ +	*fN1 = (u16)((((best_err / *N2 * *P2) * (*P1 * *M1)) <<
> 13) / crystal); +	if (upper) +		*fN1 = (u16)(1 - *fN1); + +	return
> calc_pll_clock(*fN1, 1, *N1, *P1, crystal); +} diff --git
> a/drm/nouveau/nvkm/subdev/clk/pllgt215.c 
> b/drm/nouveau/nvkm/subdev/clk/pllgt215.c index c6fccd6..fe9886c 
> 100644 --- a/drm/nouveau/nvkm/subdev/clk/pllgt215.c +++ 
> b/drm/nouveau/nvkm/subdev/clk/pllgt215.c @@ -27,6 +27,12 @@ 
> #include <subdev/bios/pll.h>
> 
> int +calc_pll_clock(int fN, int M, int N, int P, int clk) +{ + 
> return ((clk * N) + (((u16)(fN + 4096) * clk) >> 13)) / (M * P);
> +} + +int gt215_pll_calc(struct nvkm_subdev *subdev, struct
> nvbios_pll *info, u32 freq, int *pN, int *pfN, int *pM, int *P) {
> diff --git a/drm/nouveau/nvkm/subdev/fb/ramgk104.c 
> b/drm/nouveau/nvkm/subdev/fb/ramgk104.c index 9893556..c4373bf 
> 100644 --- a/drm/nouveau/nvkm/subdev/fb/ramgk104.c +++ 
> b/drm/nouveau/nvkm/subdev/fb/ramgk104.c @@ -968,31 +968,24 @@ 
> gk104_ram_calc_xits(struct gk104_ram *ram, struct nvkm_ram_data 
> *next) * kepler boards, no idea how/why they're chosen. */ refclk =
> next->freq; -	if (ram->mode == 2) -		refclk = fuc->mempll.refclk; -
> -	/* calculate refpll coefficients */ -	ret = 
> gt215_pll_calc(subdev, &fuc->refpll, refclk, &ram->N1, -
> &ram->fN1, &ram->M1, &ram->P1); -	fuc->mempll.refclk = ret; -	if
> (ret <= 0) { -		nvkm_error(subdev, "unable to calc refpll\n"); -
> return -EINVAL; -	} - -	/* calculate mempll coefficients, if we're
> using it */ if (ram->mode == 2) { -		/* post-divider doesn't
> work... the reg takes the values but -		 * appears to completely
> ignore it.  there *is* a bit at -		 * bit 28 that appears to divide
> the clock by 2 if set. - */ -		fuc->mempll.min_p = 1; -
> fuc->mempll.max_p = 2; - -		ret = gt215_pll_calc(subdev,
> &fuc->mempll, next->freq, - &ram->N2, NULL, &ram->M2, &ram->P2); +
> ret = gk104_pll_calc_hiclk(next->freq, subdev->device->crystal, + 
> &ram->N1, &ram->fN1, &ram->M1, &ram->P1, +				&ram->N2, &ram->M2, 
> &ram->P2); +		fuc->mempll.refclk = ret; +		if (ret <= 0) { + 
> nvkm_error(subdev, "unable to calc plls\n"); +			return -EINVAL; + 
> } +		nvkm_debug(subdev, "sucessfully calced PLLs for clock %i kHz" 
> +				" (refclock: %i kHz)\n", next->freq, ret); +	} else { +		/* 
> calculate refpll coefficients */ +		ret = gt215_pll_calc(subdev, 
> &fuc->refpll, refclk, &ram->N1, +				     &ram->fN1, &ram->M1, 
> &ram->P1); + fuc->mempll.refclk = ret; if (ret <= 0) { - 
> nvkm_error(subdev, "unable to calc mempll\n"); + nvkm_error(subdev,
> "unable to calc refpll\n"); return -EINVAL; } }
> 
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2

iQIcBAEBCAAGBQJWHYt7AAoJEHYLnGJQkpH7f+0P/3Tc7KO87UBovvZTXh/cxQXp
0sXh3nWd/M8mBokqHPFFKTuX5ci8qJNlRBKnQp9n0GKGe/tlIronnJLoV3hdFZun
Sh6JW96SyomW7FCz2ZxppE7sasvcfcEclyg+suMtiCzFUnbu8xYRmIAgf8QC3GUk
PhcD5JfZsUbExA9Q9FXAmiI6s7PYH5GPz5Zdu1I4t+Dihuvsh8ztEkdKiRTZjCfn
p7yVq1+td0KO011dFWioRfQszLzV7luftPCHoBeZN2b2jb32v9HaQHuQ91GlY0W/
zNO1zgUQpxbb3cbkBmhT8lbQiOr160MPqeKm3lu9fs2MrXKg6x+8iPisB6hTSrh6
lIW8WE9SwIlqAVbzd4mbuVa1NngYtXBK4LGi9hTfbUMbZ6hBkuoVIWNwFMxRPdkq
vkYRawoYOiJzVdD4/S1gUC1GrDihZffhxHKG+92Z4o4ktEemvxH0vE7uM9v7XnHz
U2UByrOnFpPabMWroLIwAkei/QKK7c7dvKCP9q2zDiF0BXlqaxO9Pg7eVWXl5l0o
2WKXpUghkklz74RzNe+205jUfcjwAsEMBQIc9+73wcGFAnLy/fh/dnJIrIyyrSnf
DGUbFzGXr6k4/OBGH5b00Fcq18UYrbty2OVXu1qf5bZf0egod8D751QPnQweMHME
cP30MH0ZEVPWGchbbq28
=EzjZ
-----END PGP SIGNATURE-----
_______________________________________________
Nouveau mailing list
Nouveau@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/nouveau

^ permalink raw reply	[flat|nested] 4+ messages in thread

* [PATCH] pll/gk104: fix PLL instability due to bad configuration with gddr5
       [not found] ` <1444680774-7935-1-git-send-email-nouveau-lIBOoy2+GI7scQ4cX5LuPg@public.gmane.org>
@ 2015-10-12 20:12   ` Karol Herbst
  0 siblings, 0 replies; 4+ messages in thread
From: Karol Herbst @ 2015-10-12 20:12 UTC (permalink / raw)
  To: nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

this patch uses an approach closer to the nvidia driver to configure both PLLs
for high gddr5 memory clocks (usually above 2400MHz)

previously nouveau used the one PLL as it was used for the lower clocks and just
adjusted the second PLL to get as close as possible to the requested clock.
This means for my card, that I got a 4050 MHz clock allthough 4008 MHz was
requested.

now the driver iterates over a list of PLL configuration also used by the
nvidia driver and then adjust the second PLL to get near the requested clock.
Also it hold to some restriction I found while analyzing the PLL configurations

This won't fix all gddr5 high clock issues itself, but it should be fine on
hybrid gpu systems as found on many laptops these days. Also switching while
normal desktop usage should be a lot more stable than before.

Signed-off-by: Karol Herbst <nouveau@karolherbst.de>
---
 drm/nouveau/nvkm/subdev/clk/Kbuild     |  1 +
 drm/nouveau/nvkm/subdev/clk/pll.h      |  5 +++
 drm/nouveau/nvkm/subdev/clk/pllgk104.c | 77 ++++++++++++++++++++++++++++++++++
 drm/nouveau/nvkm/subdev/clk/pllgt215.c |  6 +++
 drm/nouveau/nvkm/subdev/fb/ramgk104.c  | 39 +++++++----------
 5 files changed, 105 insertions(+), 23 deletions(-)
 create mode 100644 drm/nouveau/nvkm/subdev/clk/pllgk104.c

diff --git a/drm/nouveau/nvkm/subdev/clk/Kbuild b/drm/nouveau/nvkm/subdev/clk/Kbuild
index ed7717b..5ac23fe 100644
--- a/drm/nouveau/nvkm/subdev/clk/Kbuild
+++ b/drm/nouveau/nvkm/subdev/clk/Kbuild
@@ -11,3 +11,4 @@ nvkm-y += nvkm/subdev/clk/gk20a.o
 
 nvkm-y += nvkm/subdev/clk/pllnv04.o
 nvkm-y += nvkm/subdev/clk/pllgt215.o
+nvkm-y += nvkm/subdev/clk/pllgk104.o
diff --git a/drm/nouveau/nvkm/subdev/clk/pll.h b/drm/nouveau/nvkm/subdev/clk/pll.h
index 44020a3..1f5e0a0 100644
--- a/drm/nouveau/nvkm/subdev/clk/pll.h
+++ b/drm/nouveau/nvkm/subdev/clk/pll.h
@@ -4,8 +4,13 @@
 struct nvkm_subdev;
 struct nvbios_pll;
 
+int calc_pll_clock(int fN, int M, int N, int P, int clk);
+
 int nv04_pll_calc(struct nvkm_subdev *, struct nvbios_pll *, u32 freq,
 		  int *N1, int *M1, int *N2, int *M2, int *P);
 int gt215_pll_calc(struct nvkm_subdev *, struct nvbios_pll *, u32 freq,
 		  int *N, int *fN, int *M, int *P);
+int gk104_pll_calc_hiclk(int target_khz, int crystal,
+		int *N1, int *fN1, int *M1, int *P1,
+		int *N2, int *M2, int *P2);
 #endif
diff --git a/drm/nouveau/nvkm/subdev/clk/pllgk104.c b/drm/nouveau/nvkm/subdev/clk/pllgk104.c
new file mode 100644
index 0000000..4863e64
--- /dev/null
+++ b/drm/nouveau/nvkm/subdev/clk/pllgk104.c
@@ -0,0 +1,77 @@
+/*
+ * Copyright 2015 Karol Herbst
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ *
+ * Authors: Karol Herbst
+ */
+#include "pll.h"
+
+int
+gk104_pll_calc_hiclk(int target_khz, int crystal,
+		int *N1, int *fN1, int *M1, int *P1,
+		int *N2, int *M2, int *P2)
+{
+	int best_clk = 0, best_err = target_khz, p_ref, n_ref;
+	bool upper = false;
+
+	*M1 = 1;
+	/* M has to be 1, otherwise it gets unstable */
+	*M2 = 1;
+	/* can be 1 or 2, sticking with 1 for simplicity */
+	*P2 = 1;
+
+	for (p_ref = 0x7; p_ref >= 0x5; --p_ref) {
+		for (n_ref = 0x25; n_ref <= 0x2b; ++n_ref) {
+			int cur_N, cur_clk, cur_err;
+
+			cur_clk = calc_pll_clock(0, 1, n_ref, p_ref, crystal);
+			cur_N = target_khz / cur_clk;
+			cur_err = target_khz - calc_pll_clock(0xf000, 1, cur_N, 1, cur_clk);
+
+			/* we found a better combination */
+			if (cur_err < best_err) {
+				best_err = cur_err;
+				best_clk = cur_clk;
+				*N2 = cur_N;
+				*N1 = n_ref;
+				*P1 = p_ref;
+				upper = false;
+			}
+
+			cur_N += 1;
+			cur_err = calc_pll_clock(0xf000, 1, cur_N, 1, cur_clk) - target_khz;
+			if (cur_err < best_err) {
+				best_err = cur_err;
+				best_clk = cur_clk;
+				*N2 = cur_N;
+				*N1 = n_ref;
+				*P1 = p_ref;
+				upper = true;
+			}
+		}
+	}
+
+	/* adjust fN to get closer to the target clock */
+	*fN1 = (u16)((((best_err / *N2 * *P2) * (*P1 * *M1)) << 13) / crystal);
+	if (upper)
+		*fN1 = (u16)(1 - *fN1);
+
+	return calc_pll_clock(*fN1, 1, *N1, *P1, crystal);
+}
diff --git a/drm/nouveau/nvkm/subdev/clk/pllgt215.c b/drm/nouveau/nvkm/subdev/clk/pllgt215.c
index c6fccd6..fe9886c 100644
--- a/drm/nouveau/nvkm/subdev/clk/pllgt215.c
+++ b/drm/nouveau/nvkm/subdev/clk/pllgt215.c
@@ -27,6 +27,12 @@
 #include <subdev/bios/pll.h>
 
 int
+calc_pll_clock(int fN, int M, int N, int P, int clk)
+{
+	return ((clk * N) + (((u16)(fN + 4096) * clk) >> 13)) / (M * P);
+}
+
+int
 gt215_pll_calc(struct nvkm_subdev *subdev, struct nvbios_pll *info,
 	       u32 freq, int *pN, int *pfN, int *pM, int *P)
 {
diff --git a/drm/nouveau/nvkm/subdev/fb/ramgk104.c b/drm/nouveau/nvkm/subdev/fb/ramgk104.c
index 9893556..c4373bf 100644
--- a/drm/nouveau/nvkm/subdev/fb/ramgk104.c
+++ b/drm/nouveau/nvkm/subdev/fb/ramgk104.c
@@ -968,31 +968,24 @@ gk104_ram_calc_xits(struct gk104_ram *ram, struct nvkm_ram_data *next)
 	 * kepler boards, no idea how/why they're chosen.
 	 */
 	refclk = next->freq;
-	if (ram->mode == 2)
-		refclk = fuc->mempll.refclk;
-
-	/* calculate refpll coefficients */
-	ret = gt215_pll_calc(subdev, &fuc->refpll, refclk, &ram->N1,
-			     &ram->fN1, &ram->M1, &ram->P1);
-	fuc->mempll.refclk = ret;
-	if (ret <= 0) {
-		nvkm_error(subdev, "unable to calc refpll\n");
-		return -EINVAL;
-	}
-
-	/* calculate mempll coefficients, if we're using it */
 	if (ram->mode == 2) {
-		/* post-divider doesn't work... the reg takes the values but
-		 * appears to completely ignore it.  there *is* a bit at
-		 * bit 28 that appears to divide the clock by 2 if set.
-		 */
-		fuc->mempll.min_p = 1;
-		fuc->mempll.max_p = 2;
-
-		ret = gt215_pll_calc(subdev, &fuc->mempll, next->freq,
-				     &ram->N2, NULL, &ram->M2, &ram->P2);
+		ret = gk104_pll_calc_hiclk(next->freq, subdev->device->crystal,
+				&ram->N1, &ram->fN1, &ram->M1, &ram->P1,
+				&ram->N2, &ram->M2, &ram->P2);
+		fuc->mempll.refclk = ret;
+		if (ret <= 0) {
+			nvkm_error(subdev, "unable to calc plls\n");
+			return -EINVAL;
+		}
+		nvkm_debug(subdev, "sucessfully calced PLLs for clock %i kHz"
+				" (refclock: %i kHz)\n", next->freq, ret);
+	} else {
+		/* calculate refpll coefficients */
+		ret = gt215_pll_calc(subdev, &fuc->refpll, refclk, &ram->N1,
+				     &ram->fN1, &ram->M1, &ram->P1);
+		fuc->mempll.refclk = ret;
 		if (ret <= 0) {
-			nvkm_error(subdev, "unable to calc mempll\n");
+			nvkm_error(subdev, "unable to calc refpll\n");
 			return -EINVAL;
 		}
 	}
-- 
2.6.1

_______________________________________________
Nouveau mailing list
Nouveau@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/nouveau

^ permalink raw reply related	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2015-10-13 22:53 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-10-12 20:18 fixing GDDR5 reclocking on kepler cards Karol Herbst
     [not found] ` <1444681109-380-1-git-send-email-nouveau-lIBOoy2+GI7scQ4cX5LuPg@public.gmane.org>
2015-10-12 20:18   ` [PATCH] pll/gk104: fix PLL instability due to bad configuration with gddr5 Karol Herbst
     [not found]     ` <1444681109-380-2-git-send-email-nouveau-lIBOoy2+GI7scQ4cX5LuPg@public.gmane.org>
2015-10-13 22:53       ` Ben Skeggs
  -- strict thread matches above, loose matches on Subject: below --
2015-10-12 20:12 fixing GDDR5 reclocking on kepler cards Karol Herbst
     [not found] ` <1444680774-7935-1-git-send-email-nouveau-lIBOoy2+GI7scQ4cX5LuPg@public.gmane.org>
2015-10-12 20:12   ` [PATCH] pll/gk104: fix PLL instability due to bad configuration with gddr5 Karol Herbst

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.