All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v3 0/6] Support large pattern buffers
@ 2022-11-18 23:15 Logan Gunthorpe
  2022-11-18 23:15 ` [PATCH v3 1/6] cconv: Support pattern buffers of arbitrary size Logan Gunthorpe
                   ` (6 more replies)
  0 siblings, 7 replies; 8+ messages in thread
From: Logan Gunthorpe @ 2022-11-18 23:15 UTC (permalink / raw)
  To: fio, Jens Axboe, Vincent Fu; +Cc: David Sloan, Stephen Bates, Logan Gunthorpe

Hello,

This is v3 of this patchset which includes changes for the feedback
thus far.

When testing storage platforms that have integrated compression it is
useful to use specifically compressible real world data. The existing
buffer_compress_percentage option doesn't work well for testing the
performance of the compression algorithm portion of the stack seeing it
results in random data (which can slow down many compression algorithms)
followed by trivially compressible data (which can often be compressed
at a higher performance than real world data).

Instead of the crude compressibility approximation when using
buffer_compress_percentage, it would be useful to specify specific test
data for benchmarking purposes. Using real world data or well-known
test data like the Calgary Corpus would help to make the benchmarks
closer to real world applications.

The existing buffer_pattern option would almost work for this, except it
is limited to a maximum length of 512B and will be repeated in the
IO buffer and thus can't be used for any arbitrary compression
size seeing repeated data is easily compressed.

This patch set proposes allowing much larger buffer patterns,
that may be loaded from a file. The first patch provides the
infrastructure for passing arbitrary sized buffers in the client/server
protocol (which means the server version number must be bumped), the
second patch adds a method for pre-calculating the buffer size
in parse_and_fill_pattern(), the third patch supports properly
reading large files and the final patch enables dynamic allocation of the
buffers themselves.

The buffer_pattern will still be truncated to the block size of
the IO and repeated for every IO, but this should still be acceptable
for testing compression in storage which tends to operate on
fixed size chunks anyway.

An arbitrary limit on the pattern buffer of 128MB is still enforced to
avoid issues that would crop up with the maximum limit of the
client/server commands (FIO_SERVER_MAX_CMD_MB). This should be more than
enough seeing it isn't useful to specify a buffer greater than the block
size and blocksizes that large are not very practical.

Thanks,

Logan

--

Changes since v2:
  - Fix missing signed off by on patch
  - Add a patch to properly support binary patterns on windows
    per Vincent and the CI run of the new test

Changes since v1:
  - Added comment on the patterns member of struct thread_options_pack,
    per Vincent
  - Added a test in the suite to test a 16KB buffer and verify pattern

--

Logan Gunthorpe (6):
  cconv: Support pattern buffers of arbitrary size
  lib/pattern: Support NULL output buffer in parse_and_fill_pattern()
  lib/pattern: Support short repeated read calls when loading from file
  options: Support arbitrarily long pattern buffers
  lib/pattern: Support binary pattern buffers on windows
  test: add large pattern test

 cconv.c            |  86 +++++++++++++++++++++++++++-----------
 client.c           |  17 +++++---
 gclient.c          |  12 ++++--
 lib/pattern.c      | 100 +++++++++++++++++++++++++++++++++++++--------
 lib/pattern.h      |  21 +++++++---
 options.c          |  10 ++---
 server.c           |  23 +++++++----
 server.h           |   2 +-
 stat.h             |   1 -
 t/jobs/t0027.fio   |  14 +++++++
 t/run-fio-tests.py |  29 +++++++++++++
 thread_options.h   |  15 ++++---
 12 files changed, 257 insertions(+), 73 deletions(-)
 create mode 100644 t/jobs/t0027.fio


base-commit: 07c8fe21021681f86fbfd3c3d63b88a5ebd4e557
--
2.30.2

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH v3 1/6] cconv: Support pattern buffers of arbitrary size
  2022-11-18 23:15 [PATCH v3 0/6] Support large pattern buffers Logan Gunthorpe
@ 2022-11-18 23:15 ` Logan Gunthorpe
  2022-11-18 23:15 ` [PATCH v3 2/6] lib/pattern: Support NULL output buffer in parse_and_fill_pattern() Logan Gunthorpe
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Logan Gunthorpe @ 2022-11-18 23:15 UTC (permalink / raw)
  To: fio, Jens Axboe, Vincent Fu; +Cc: David Sloan, Stephen Bates, Logan Gunthorpe

Change the thread_options_pack structure to support pattern buffers
of arbitrary size by using a flexible array at the end of the the
structure to store both the verify_pattern and the buffer_pattern
in that order.

In this way, only the actual bytes of each pattern will be sent over
the wire and patterns of an arbitrary size can be used with the packed
structure.

In order to determine the required size of the structure the function
thread_options_pack_size() is introduced which returns the total
number of bytes required for a given thread_options instance.

The two callsites of convert_thread_options_to_net() are then converted
to dynamically allocate a pdu of the appropriate size and the
two callsites of convert_thread_options_to_cpu() are modified to
take the size of the received data to prevent buffer overruns.

Also add specific testing of this feature in fio_test_cconv().

Seeing this changes the client/server protocol, the FIO_SERVER_VER
is bumped.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
---
 cconv.c          | 75 +++++++++++++++++++++++++++++++++---------------
 client.c         | 17 +++++++----
 gclient.c        | 12 ++++++--
 server.c         | 23 +++++++++------
 server.h         |  2 +-
 thread_options.h | 11 +++++--
 6 files changed, 96 insertions(+), 44 deletions(-)

diff --git a/cconv.c b/cconv.c
index 6c36afb72d1e..bb1af4970e97 100644
--- a/cconv.c
+++ b/cconv.c
@@ -54,8 +54,15 @@ static void free_thread_options_to_cpu(struct thread_options *o)
 	}
 }
 
-void convert_thread_options_to_cpu(struct thread_options *o,
-				   struct thread_options_pack *top)
+size_t thread_options_pack_size(struct thread_options *o)
+{
+	return sizeof(struct thread_options_pack) + o->verify_pattern_bytes +
+		o->buffer_pattern_bytes;
+}
+
+int convert_thread_options_to_cpu(struct thread_options *o,
+				  struct thread_options_pack *top,
+				  size_t top_sz)
 {
 	int i, j;
 
@@ -171,10 +178,17 @@ void convert_thread_options_to_cpu(struct thread_options *o,
 	o->verify_interval = le32_to_cpu(top->verify_interval);
 	o->verify_offset = le32_to_cpu(top->verify_offset);
 
-	memcpy(o->verify_pattern, top->verify_pattern, MAX_PATTERN_SIZE);
-	memcpy(o->buffer_pattern, top->buffer_pattern, MAX_PATTERN_SIZE);
-
 	o->verify_pattern_bytes = le32_to_cpu(top->verify_pattern_bytes);
+	o->buffer_pattern_bytes = le32_to_cpu(top->buffer_pattern_bytes);
+	if (o->verify_pattern_bytes >= MAX_PATTERN_SIZE ||
+	    o->buffer_pattern_bytes >= MAX_PATTERN_SIZE ||
+	    thread_options_pack_size(o) > top_sz)
+		return -EINVAL;
+
+	memcpy(o->verify_pattern, top->patterns, o->verify_pattern_bytes);
+	memcpy(o->buffer_pattern, &top->patterns[o->verify_pattern_bytes],
+	       o->buffer_pattern_bytes);
+
 	o->verify_fatal = le32_to_cpu(top->verify_fatal);
 	o->verify_dump = le32_to_cpu(top->verify_dump);
 	o->verify_async = le32_to_cpu(top->verify_async);
@@ -268,7 +282,6 @@ void convert_thread_options_to_cpu(struct thread_options *o,
 	o->zero_buffers = le32_to_cpu(top->zero_buffers);
 	o->refill_buffers = le32_to_cpu(top->refill_buffers);
 	o->scramble_buffers = le32_to_cpu(top->scramble_buffers);
-	o->buffer_pattern_bytes = le32_to_cpu(top->buffer_pattern_bytes);
 	o->time_based = le32_to_cpu(top->time_based);
 	o->disable_lat = le32_to_cpu(top->disable_lat);
 	o->disable_clat = le32_to_cpu(top->disable_clat);
@@ -334,6 +347,8 @@ void convert_thread_options_to_cpu(struct thread_options *o,
 	uint8_t verify_cpumask[FIO_TOP_STR_MAX];
 	uint8_t log_gz_cpumask[FIO_TOP_STR_MAX];
 #endif
+
+	return 0;
 }
 
 void convert_thread_options_to_net(struct thread_options_pack *top,
@@ -572,8 +587,9 @@ void convert_thread_options_to_net(struct thread_options_pack *top,
 		top->max_latency[i] = __cpu_to_le64(o->max_latency[i]);
 	}
 
-	memcpy(top->verify_pattern, o->verify_pattern, MAX_PATTERN_SIZE);
-	memcpy(top->buffer_pattern, o->buffer_pattern, MAX_PATTERN_SIZE);
+	memcpy(top->patterns, o->verify_pattern, o->verify_pattern_bytes);
+	memcpy(&top->patterns[o->verify_pattern_bytes], o->buffer_pattern,
+	       o->buffer_pattern_bytes);
 
 	top->size = __cpu_to_le64(o->size);
 	top->io_size = __cpu_to_le64(o->io_size);
@@ -620,7 +636,6 @@ void convert_thread_options_to_net(struct thread_options_pack *top,
 	uint8_t verify_cpumask[FIO_TOP_STR_MAX];
 	uint8_t log_gz_cpumask[FIO_TOP_STR_MAX];
 #endif
-
 }
 
 /*
@@ -630,18 +645,32 @@ void convert_thread_options_to_net(struct thread_options_pack *top,
  */
 int fio_test_cconv(struct thread_options *__o)
 {
-	struct thread_options o;
-	struct thread_options_pack top1, top2;
-
-	memset(&top1, 0, sizeof(top1));
-	memset(&top2, 0, sizeof(top2));
-
-	convert_thread_options_to_net(&top1, __o);
-	memset(&o, 0, sizeof(o));
-	convert_thread_options_to_cpu(&o, &top1);
-	convert_thread_options_to_net(&top2, &o);
-
-	free_thread_options_to_cpu(&o);
-
-	return memcmp(&top1, &top2, sizeof(top1));
+	struct thread_options o1 = *__o, o2;
+	struct thread_options_pack *top1, *top2;
+	size_t top_sz;
+	int ret;
+
+	o1.verify_pattern_bytes = 61;
+	memset(o1.verify_pattern, 'V', o1.verify_pattern_bytes);
+	o1.buffer_pattern_bytes = 15;
+	memset(o1.buffer_pattern, 'B', o1.buffer_pattern_bytes);
+
+	top_sz = thread_options_pack_size(&o1);
+	top1 = calloc(1, top_sz);
+	top2 = calloc(1, top_sz);
+
+	convert_thread_options_to_net(top1, &o1);
+	memset(&o2, 0, sizeof(o2));
+	ret = convert_thread_options_to_cpu(&o2, top1, top_sz);
+	if (ret)
+		goto out;
+
+	convert_thread_options_to_net(top2, &o2);
+	ret = memcmp(top1, top2, top_sz);
+
+out:
+	free_thread_options_to_cpu(&o2);
+	free(top2);
+	free(top1);
+	return ret;
 }
diff --git a/client.c b/client.c
index 37da74bca5d7..51496c770248 100644
--- a/client.c
+++ b/client.c
@@ -922,13 +922,20 @@ int fio_clients_send_ini(const char *filename)
 int fio_client_update_options(struct fio_client *client,
 			      struct thread_options *o, uint64_t *tag)
 {
-	struct cmd_add_job_pdu pdu;
+	size_t cmd_sz = offsetof(struct cmd_add_job_pdu, top) +
+		thread_options_pack_size(o);
+	struct cmd_add_job_pdu *pdu;
+	int ret;
 
-	pdu.thread_number = cpu_to_le32(client->thread_number);
-	pdu.groupid = cpu_to_le32(client->groupid);
-	convert_thread_options_to_net(&pdu.top, o);
+	pdu = malloc(cmd_sz);
+	pdu->thread_number = cpu_to_le32(client->thread_number);
+	pdu->groupid = cpu_to_le32(client->groupid);
+	convert_thread_options_to_net(&pdu->top, o);
 
-	return fio_net_send_cmd(client->fd, FIO_NET_CMD_UPDATE_JOB, &pdu, sizeof(pdu), tag, &client->cmd_list);
+	ret = fio_net_send_cmd(client->fd, FIO_NET_CMD_UPDATE_JOB, pdu,
+			       cmd_sz, tag, &client->cmd_list);
+	free(pdu);
+	return ret;
 }
 
 static void convert_io_stat(struct io_stat *dst, struct io_stat *src)
diff --git a/gclient.c b/gclient.c
index c59bcfe2f6f3..73f64b3b87f1 100644
--- a/gclient.c
+++ b/gclient.c
@@ -553,12 +553,15 @@ static void gfio_quit_op(struct fio_client *client, struct fio_net_cmd *cmd)
 }
 
 static struct thread_options *gfio_client_add_job(struct gfio_client *gc,
-			struct thread_options_pack *top)
+			struct thread_options_pack *top, size_t top_sz)
 {
 	struct gfio_client_options *gco;
 
 	gco = calloc(1, sizeof(*gco));
-	convert_thread_options_to_cpu(&gco->o, top);
+	if (convert_thread_options_to_cpu(&gco->o, top, top_sz)) {
+		dprint(FD_NET, "client: failed parsing add_job command\n");
+		return NULL;
+	}
 	INIT_FLIST_HEAD(&gco->list);
 	flist_add_tail(&gco->list, &gc->o_list);
 	gc->o_list_nr = 1;
@@ -577,7 +580,10 @@ static void gfio_add_job_op(struct fio_client *client, struct fio_net_cmd *cmd)
 
 	p->thread_number = le32_to_cpu(p->thread_number);
 	p->groupid = le32_to_cpu(p->groupid);
-	o = gfio_client_add_job(gc, &p->top);
+	o = gfio_client_add_job(gc, &p->top,
+			cmd->pdu_len - offsetof(struct cmd_add_job_pdu, top));
+	if (o == NULL)
+		return;
 
 	gdk_threads_enter();
 
diff --git a/server.c b/server.c
index b869d3872734..a6347efd8281 100644
--- a/server.c
+++ b/server.c
@@ -1082,6 +1082,7 @@ static int handle_update_job_cmd(struct fio_net_cmd *cmd)
 	struct cmd_add_job_pdu *pdu = (struct cmd_add_job_pdu *) cmd->payload;
 	struct thread_data *td;
 	uint32_t tnumber;
+	int ret;
 
 	tnumber = le32_to_cpu(pdu->thread_number);
 
@@ -1093,8 +1094,9 @@ static int handle_update_job_cmd(struct fio_net_cmd *cmd)
 	}
 
 	td = tnumber_to_td(tnumber);
-	convert_thread_options_to_cpu(&td->o, &pdu->top);
-	send_update_job_reply(cmd->tag, 0);
+	ret = convert_thread_options_to_cpu(&td->o, &pdu->top,
+			cmd->pdu_len - offsetof(struct cmd_add_job_pdu, top));
+	send_update_job_reply(cmd->tag, ret);
 	return 0;
 }
 
@@ -2323,15 +2325,18 @@ int fio_send_iolog(struct thread_data *td, struct io_log *log, const char *name)
 
 void fio_server_send_add_job(struct thread_data *td)
 {
-	struct cmd_add_job_pdu pdu = {
-		.thread_number = cpu_to_le32(td->thread_number),
-		.groupid = cpu_to_le32(td->groupid),
-	};
+	struct cmd_add_job_pdu *pdu;
+	size_t cmd_sz = offsetof(struct cmd_add_job_pdu, top) +
+		thread_options_pack_size(&td->o);
 
-	convert_thread_options_to_net(&pdu.top, &td->o);
+	pdu = malloc(cmd_sz);
+	pdu->thread_number = cpu_to_le32(td->thread_number);
+	pdu->groupid = cpu_to_le32(td->groupid);
 
-	fio_net_queue_cmd(FIO_NET_CMD_ADD_JOB, &pdu, sizeof(pdu), NULL,
-				SK_F_COPY);
+	convert_thread_options_to_net(&pdu->top, &td->o);
+
+	fio_net_queue_cmd(FIO_NET_CMD_ADD_JOB, pdu, cmd_sz, NULL, SK_F_COPY);
+	free(pdu);
 }
 
 void fio_server_send_start(struct thread_data *td)
diff --git a/server.h b/server.h
index b0c5e2dfafa0..281330202884 100644
--- a/server.h
+++ b/server.h
@@ -51,7 +51,7 @@ struct fio_net_cmd_reply {
 };
 
 enum {
-	FIO_SERVER_VER			= 97,
+	FIO_SERVER_VER			= 98,
 
 	FIO_SERVER_MAX_FRAGMENT_PDU	= 1024,
 	FIO_SERVER_MAX_CMD_MB		= 2048,
diff --git a/thread_options.h b/thread_options.h
index 634070af00ec..d2897ac274bf 100644
--- a/thread_options.h
+++ b/thread_options.h
@@ -464,7 +464,6 @@ struct thread_options_pack {
 	uint32_t do_verify;
 	uint32_t verify_interval;
 	uint32_t verify_offset;
-	uint8_t verify_pattern[MAX_PATTERN_SIZE];
 	uint32_t verify_pattern_bytes;
 	uint32_t verify_fatal;
 	uint32_t verify_dump;
@@ -572,7 +571,6 @@ struct thread_options_pack {
 	uint32_t zero_buffers;
 	uint32_t refill_buffers;
 	uint32_t scramble_buffers;
-	uint8_t buffer_pattern[MAX_PATTERN_SIZE];
 	uint32_t buffer_pattern_bytes;
 	uint32_t compress_percentage;
 	uint32_t compress_chunk;
@@ -699,9 +697,16 @@ struct thread_options_pack {
 
 	uint32_t log_entries;
 	uint32_t log_prio;
+
+	/*
+	 * verify_pattern followed by buffer_pattern from the unpacked struct
+	 */
+	uint8_t patterns[];
 } __attribute__((packed));
 
-extern void convert_thread_options_to_cpu(struct thread_options *o, struct thread_options_pack *top);
+extern int convert_thread_options_to_cpu(struct thread_options *o,
+		struct thread_options_pack *top, size_t top_sz);
+extern size_t thread_options_pack_size(struct thread_options *o);
 extern void convert_thread_options_to_net(struct thread_options_pack *top, struct thread_options *);
 extern int fio_test_cconv(struct thread_options *);
 extern void options_default_fill(struct thread_options *o);
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH v3 2/6] lib/pattern: Support NULL output buffer in parse_and_fill_pattern()
  2022-11-18 23:15 [PATCH v3 0/6] Support large pattern buffers Logan Gunthorpe
  2022-11-18 23:15 ` [PATCH v3 1/6] cconv: Support pattern buffers of arbitrary size Logan Gunthorpe
@ 2022-11-18 23:15 ` Logan Gunthorpe
  2022-11-18 23:15 ` [PATCH v3 3/6] lib/pattern: Support short repeated read calls when loading from file Logan Gunthorpe
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Logan Gunthorpe @ 2022-11-18 23:15 UTC (permalink / raw)
  To: fio, Jens Axboe, Vincent Fu; +Cc: David Sloan, Stephen Bates, Logan Gunthorpe

Support passing a NULL output buffer to parse_and_fill_pattern().
Each formatting function simply needs to avoid accessing the buffer
when it is NULL. This allows calculating the required size of the
buffer before one might be allocated.

This will be useful in a subsequent patch for dynamically allocating
the pattern buffers.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
---
 lib/pattern.c | 35 +++++++++++++++++++++++++----------
 1 file changed, 25 insertions(+), 10 deletions(-)

diff --git a/lib/pattern.c b/lib/pattern.c
index d8203630d3ed..70d0313d237e 100644
--- a/lib/pattern.c
+++ b/lib/pattern.c
@@ -51,9 +51,17 @@ static const char *parse_file(const char *beg, char *out,
 	if (fd < 0)
 		goto err_free_out;
 
-	count = read(fd, out, out_len);
-	if (count == -1)
-		goto err_free_close_out;
+	if (out) {
+		count = read(fd, out, out_len);
+		if (count == -1)
+			goto err_free_close_out;
+	} else {
+		count = lseek(fd, 0, SEEK_END);
+		if (count == -1)
+			goto err_free_close_out;
+		if (count >= out_len)
+			count = out_len;
+	}
 
 	*filled = count;
 	close(fd);
@@ -100,7 +108,8 @@ static const char *parse_string(const char *beg, char *out,
 	if (end - beg > out_len)
 		return NULL;
 
-	memcpy(out, beg, end - beg);
+	if (out)
+		memcpy(out, beg, end - beg);
 	*filled = end - beg;
 
 	/* Catch up quote */
@@ -156,12 +165,14 @@ static const char *parse_number(const char *beg, char *out,
 		i = 0;
 		if (!lval) {
 			num    = 0;
-			out[i] = 0x00;
+			if (out)
+				out[i] = 0x00;
 			i      = 1;
 		} else {
 			val = (unsigned int)lval;
 			for (; val && out_len; out_len--, i++, val >>= 8)
-				out[i] = val & 0xff;
+				if (out)
+					out[i] = val & 0xff;
 			if (val)
 				return NULL;
 		}
@@ -183,7 +194,8 @@ static const char *parse_number(const char *beg, char *out,
 			const char *fmt;
 
 			fmt = (num & 1 ? "%1hhx" : "%2hhx");
-			sscanf(beg, fmt, &out[i]);
+			if (out)
+				sscanf(beg, fmt, &out[i]);
 			if (num & 1) {
 				num++;
 				beg--;
@@ -251,7 +263,8 @@ static const char *parse_format(const char *in, char *out, unsigned int parsed,
 	if (f->desc->len > out_len)
 		return NULL;
 
-	memset(out, '\0', f->desc->len);
+	if (out)
+		memset(out, '\0', f->desc->len);
 	*filled = f->desc->len;
 
 	return in + len;
@@ -262,7 +275,9 @@ static const char *parse_format(const char *in, char *out, unsigned int parsed,
  *                            numbers and pattern formats.
  * @in - string input
  * @in_len - size of the input string
- * @out - output buffer where parsed result will be put
+ * @out - output buffer where parsed result will be put, may be NULL
+ *	  in which case this function just calculates the required
+ *	  length of the buffer
  * @out_len - lengths of the output buffer
  * @fmt_desc - array of pattern format descriptors [input]
  * @fmt - array of pattern formats [output]
@@ -314,7 +329,7 @@ int parse_and_fill_pattern(const char *in, unsigned int in_len,
 	const char *beg, *end, *out_beg = out;
 	unsigned int total = 0, fmt_rem = 0;
 
-	if (!in || !in_len || !out || !out_len)
+	if (!in || !in_len || !out_len)
 		return -EINVAL;
 	if (fmt_sz_out)
 		fmt_rem = *fmt_sz_out;
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH v3 3/6] lib/pattern: Support short repeated read calls when loading from file
  2022-11-18 23:15 [PATCH v3 0/6] Support large pattern buffers Logan Gunthorpe
  2022-11-18 23:15 ` [PATCH v3 1/6] cconv: Support pattern buffers of arbitrary size Logan Gunthorpe
  2022-11-18 23:15 ` [PATCH v3 2/6] lib/pattern: Support NULL output buffer in parse_and_fill_pattern() Logan Gunthorpe
@ 2022-11-18 23:15 ` Logan Gunthorpe
  2022-11-18 23:15 ` [PATCH v3 4/6] options: Support arbitrarily long pattern buffers Logan Gunthorpe
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Logan Gunthorpe @ 2022-11-18 23:15 UTC (permalink / raw)
  To: fio, Jens Axboe, Vincent Fu; +Cc: David Sloan, Stephen Bates, Logan Gunthorpe

Once a pattern file can be much larger, it will be possible that
kernel will return a short read while loading the file and thus may
randomly only load part of the file.

Fix this by putting the read in a loop so the entire file will be
read.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
---
 lib/pattern.c | 15 +++++++++++----
 1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/lib/pattern.c b/lib/pattern.c
index 70d0313d237e..d324263c2b34 100644
--- a/lib/pattern.c
+++ b/lib/pattern.c
@@ -32,7 +32,7 @@ static const char *parse_file(const char *beg, char *out,
 	const char *end;
 	char *file;
 	int fd;
-	ssize_t count;
+	ssize_t rc, count = 0;
 
 	if (!out_len)
 		goto err_out;
@@ -52,9 +52,16 @@ static const char *parse_file(const char *beg, char *out,
 		goto err_free_out;
 
 	if (out) {
-		count = read(fd, out, out_len);
-		if (count == -1)
-			goto err_free_close_out;
+		while (1) {
+			rc = read(fd, out, out_len - count);
+			if (rc == 0)
+				break;
+			if (rc == -1)
+				goto err_free_close_out;
+
+			count += rc;
+			out += rc;
+		}
 	} else {
 		count = lseek(fd, 0, SEEK_END);
 		if (count == -1)
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH v3 4/6] options: Support arbitrarily long pattern buffers
  2022-11-18 23:15 [PATCH v3 0/6] Support large pattern buffers Logan Gunthorpe
                   ` (2 preceding siblings ...)
  2022-11-18 23:15 ` [PATCH v3 3/6] lib/pattern: Support short repeated read calls when loading from file Logan Gunthorpe
@ 2022-11-18 23:15 ` Logan Gunthorpe
  2022-11-18 23:16 ` [PATCH v3 5/6] lib/pattern: Support binary pattern buffers on windows Logan Gunthorpe
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Logan Gunthorpe @ 2022-11-18 23:15 UTC (permalink / raw)
  To: fio, Jens Axboe, Vincent Fu; +Cc: David Sloan, Stephen Bates, Logan Gunthorpe

Dynamically allocate the pattern buffer to remove the 512B length
restriction. To accomplish this, store a pointer instead of a fixed
block of memory for the buffers in the thread_options structure.
Then introduce and use the function parse_and_fill_pattern_alloc()
which will calculate the approprite size of the buffer and allocate
it before filling it.

The buffers will be freed, along with a number of string buffers
in free_thread_options_to_cpu(). They will also be reallocated (if
necessary) when receiving them over the wire with
convert_thread_options_to_cpu().

This allows for specifying real world compressible data (eg. The
Calgary Corpus) for the buffer_pattern option.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
---
 cconv.c          | 11 ++++++++++
 lib/pattern.c    | 52 +++++++++++++++++++++++++++++++++++++++++++-----
 lib/pattern.h    | 21 ++++++++++++++-----
 options.c        | 10 +++++-----
 stat.h           |  1 -
 thread_options.h |  4 ++--
 6 files changed, 81 insertions(+), 18 deletions(-)

diff --git a/cconv.c b/cconv.c
index bb1af4970e97..d755844f51b0 100644
--- a/cconv.c
+++ b/cconv.c
@@ -48,6 +48,9 @@ static void free_thread_options_to_cpu(struct thread_options *o)
 	free(o->profile);
 	free(o->cgroup);
 
+	free(o->verify_pattern);
+	free(o->buffer_pattern);
+
 	for (i = 0; i < DDIR_RWDIR_CNT; i++) {
 		free(o->bssplit[i]);
 		free(o->zone_split[i]);
@@ -185,6 +188,10 @@ int convert_thread_options_to_cpu(struct thread_options *o,
 	    thread_options_pack_size(o) > top_sz)
 		return -EINVAL;
 
+	o->verify_pattern = realloc(o->verify_pattern,
+				    o->verify_pattern_bytes);
+	o->buffer_pattern = realloc(o->buffer_pattern,
+				    o->buffer_pattern_bytes);
 	memcpy(o->verify_pattern, top->patterns, o->verify_pattern_bytes);
 	memcpy(o->buffer_pattern, &top->patterns[o->verify_pattern_bytes],
 	       o->buffer_pattern_bytes);
@@ -651,8 +658,10 @@ int fio_test_cconv(struct thread_options *__o)
 	int ret;
 
 	o1.verify_pattern_bytes = 61;
+	o1.verify_pattern = malloc(o1.verify_pattern_bytes);
 	memset(o1.verify_pattern, 'V', o1.verify_pattern_bytes);
 	o1.buffer_pattern_bytes = 15;
+	o1.buffer_pattern = malloc(o1.buffer_pattern_bytes);
 	memset(o1.buffer_pattern, 'B', o1.buffer_pattern_bytes);
 
 	top_sz = thread_options_pack_size(&o1);
@@ -672,5 +681,7 @@ out:
 	free_thread_options_to_cpu(&o2);
 	free(top2);
 	free(top1);
+	free(o1.buffer_pattern);
+	free(o1.verify_pattern);
 	return ret;
 }
diff --git a/lib/pattern.c b/lib/pattern.c
index d324263c2b34..1ae05758f768 100644
--- a/lib/pattern.c
+++ b/lib/pattern.c
@@ -327,11 +327,11 @@ static const char *parse_format(const char *in, char *out, unsigned int parsed,
  *
  * Returns number of bytes filled or err < 0 in case of failure.
  */
-int parse_and_fill_pattern(const char *in, unsigned int in_len,
-			   char *out, unsigned int out_len,
-			   const struct pattern_fmt_desc *fmt_desc,
-			   struct pattern_fmt *fmt,
-			   unsigned int *fmt_sz_out)
+static int parse_and_fill_pattern(const char *in, unsigned int in_len,
+				  char *out, unsigned int out_len,
+				  const struct pattern_fmt_desc *fmt_desc,
+				  struct pattern_fmt *fmt,
+				  unsigned int *fmt_sz_out)
 {
 	const char *beg, *end, *out_beg = out;
 	unsigned int total = 0, fmt_rem = 0;
@@ -392,6 +392,48 @@ int parse_and_fill_pattern(const char *in, unsigned int in_len,
 	return total;
 }
 
+/**
+ * parse_and_fill_pattern_alloc() - Parses combined input, which consists of
+ *				    strings, numbers and pattern formats and
+ *				    allocates a buffer for the result.
+ *
+ * @in - string input
+ * @in_len - size of the input string
+ * @out - pointer to the output buffer pointer, this will be set to the newly
+ *        allocated pattern buffer which must be freed by the caller
+ * @fmt_desc - array of pattern format descriptors [input]
+ * @fmt - array of pattern formats [output]
+ * @fmt_sz - pointer where the size of pattern formats array stored [input],
+ *           after successful parsing this pointer will contain the number
+ *           of parsed formats if any [output].
+ *
+ * See documentation on parse_and_fill_pattern() above for a description
+ * of the functionality.
+ *
+ * Returns number of bytes filled or err < 0 in case of failure.
+ */
+int parse_and_fill_pattern_alloc(const char *in, unsigned int in_len,
+		char **out, const struct pattern_fmt_desc *fmt_desc,
+		struct pattern_fmt *fmt, unsigned int *fmt_sz_out)
+{
+	int count;
+
+	count = parse_and_fill_pattern(in, in_len, NULL, MAX_PATTERN_SIZE,
+				       fmt_desc, fmt, fmt_sz_out);
+	if (count < 0)
+		return count;
+
+	*out = malloc(count);
+	count = parse_and_fill_pattern(in, in_len, *out, count, fmt_desc,
+				       fmt, fmt_sz_out);
+	if (count < 0) {
+		free(*out);
+		*out = NULL;
+	}
+
+	return count;
+}
+
 /**
  * dup_pattern() - Duplicates part of the pattern all over the buffer.
  *
diff --git a/lib/pattern.h b/lib/pattern.h
index a6d9d6b4275c..7123b42d6747 100644
--- a/lib/pattern.h
+++ b/lib/pattern.h
@@ -1,6 +1,19 @@
 #ifndef FIO_PARSE_PATTERN_H
 #define FIO_PARSE_PATTERN_H
 
+/*
+ * The pattern is dynamically allocated, but that doesn't mean there
+ * are not limits. The network protocol has a limit of
+ * FIO_SERVER_MAX_CMD_MB and potentially two patterns must fit in there.
+ * There's also a need to verify the incoming data from the network and
+ * this provides a sensible check.
+ *
+ * 128MiB is an arbitrary limit that meets these criteria. The patterns
+ * tend to be truncated at the IO size anyway and IO sizes that large
+ * aren't terribly practical.
+ */
+#define MAX_PATTERN_SIZE	(128 << 20)
+
 /**
  * Pattern format description. The input for 'parse_pattern'.
  * Describes format with its name and callback, which should
@@ -21,11 +34,9 @@ struct pattern_fmt {
 	const struct pattern_fmt_desc *desc;
 };
 
-int parse_and_fill_pattern(const char *in, unsigned int in_len,
-			   char *out, unsigned int out_len,
-			   const struct pattern_fmt_desc *fmt_desc,
-			   struct pattern_fmt *fmt,
-			   unsigned int *fmt_sz_out);
+int parse_and_fill_pattern_alloc(const char *in, unsigned int in_len,
+		char **out, const struct pattern_fmt_desc *fmt_desc,
+		struct pattern_fmt *fmt, unsigned int *fmt_sz_out);
 
 int paste_format_inplace(char *pattern, unsigned int pattern_len,
 			 struct pattern_fmt *fmt, unsigned int fmt_sz,
diff --git a/options.c b/options.c
index 9e4d8cd1a92d..49612345b365 100644
--- a/options.c
+++ b/options.c
@@ -1488,8 +1488,8 @@ static int str_buffer_pattern_cb(void *data, const char *input)
 	int ret;
 
 	/* FIXME: for now buffer pattern does not support formats */
-	ret = parse_and_fill_pattern(input, strlen(input), td->o.buffer_pattern,
-				     MAX_PATTERN_SIZE, NULL, NULL, NULL);
+	ret = parse_and_fill_pattern_alloc(input, strlen(input),
+				&td->o.buffer_pattern, NULL, NULL, NULL);
 	if (ret < 0)
 		return 1;
 
@@ -1537,9 +1537,9 @@ static int str_verify_pattern_cb(void *data, const char *input)
 	int ret;
 
 	td->o.verify_fmt_sz = FIO_ARRAY_SIZE(td->o.verify_fmt);
-	ret = parse_and_fill_pattern(input, strlen(input), td->o.verify_pattern,
-				     MAX_PATTERN_SIZE, fmt_desc,
-				     td->o.verify_fmt, &td->o.verify_fmt_sz);
+	ret = parse_and_fill_pattern_alloc(input, strlen(input),
+			&td->o.verify_pattern, fmt_desc, td->o.verify_fmt,
+			&td->o.verify_fmt_sz);
 	if (ret < 0)
 		return 1;
 
diff --git a/stat.h b/stat.h
index 4c3bf71f3b92..8ceabc48c7e0 100644
--- a/stat.h
+++ b/stat.h
@@ -142,7 +142,6 @@ enum block_info_state {
 	BLOCK_STATE_COUNT,
 };
 
-#define MAX_PATTERN_SIZE	512
 #define FIO_JOBNAME_SIZE	128
 #define FIO_JOBDESC_SIZE	256
 #define FIO_VERROR_SIZE		128
diff --git a/thread_options.h b/thread_options.h
index d2897ac274bf..74e7ea45868b 100644
--- a/thread_options.h
+++ b/thread_options.h
@@ -144,7 +144,7 @@ struct thread_options {
 	unsigned int do_verify;
 	unsigned int verify_interval;
 	unsigned int verify_offset;
-	char verify_pattern[MAX_PATTERN_SIZE];
+	char *verify_pattern;
 	unsigned int verify_pattern_bytes;
 	struct pattern_fmt verify_fmt[8];
 	unsigned int verify_fmt_sz;
@@ -256,7 +256,7 @@ struct thread_options {
 	unsigned int zero_buffers;
 	unsigned int refill_buffers;
 	unsigned int scramble_buffers;
-	char buffer_pattern[MAX_PATTERN_SIZE];
+	char *buffer_pattern;
 	unsigned int buffer_pattern_bytes;
 	unsigned int compress_percentage;
 	unsigned int compress_chunk;
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH v3 5/6] lib/pattern: Support binary pattern buffers on windows
  2022-11-18 23:15 [PATCH v3 0/6] Support large pattern buffers Logan Gunthorpe
                   ` (3 preceding siblings ...)
  2022-11-18 23:15 ` [PATCH v3 4/6] options: Support arbitrarily long pattern buffers Logan Gunthorpe
@ 2022-11-18 23:16 ` Logan Gunthorpe
  2022-11-18 23:16 ` [PATCH v3 6/6] test: add large pattern test Logan Gunthorpe
  2022-11-19  0:37 ` [PATCH v3 0/6] Support large pattern buffers Vincent Fu
  6 siblings, 0 replies; 8+ messages in thread
From: Logan Gunthorpe @ 2022-11-18 23:16 UTC (permalink / raw)
  To: fio, Jens Axboe, Vincent Fu; +Cc: David Sloan, Stephen Bates, Logan Gunthorpe

On windows, binary files used as pattern buffers may be mangled or
truncated seeing the files are openned in text mode.

Fix this by passing O_BINARY on windows when openning the file.

Suggested-by: Vincent Fu <vincentfu@gmail.com>
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
---
 lib/pattern.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/lib/pattern.c b/lib/pattern.c
index 1ae05758f768..9be29af6bcf5 100644
--- a/lib/pattern.c
+++ b/lib/pattern.c
@@ -47,7 +47,11 @@ static const char *parse_file(const char *beg, char *out,
 	if (file == NULL)
 		goto err_out;
 
+#ifdef _WIN32
+	fd = open(file, O_RDONLY | O_BINARY);
+#else
 	fd = open(file, O_RDONLY);
+#endif
 	if (fd < 0)
 		goto err_free_out;
 
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH v3 6/6] test: add large pattern test
  2022-11-18 23:15 [PATCH v3 0/6] Support large pattern buffers Logan Gunthorpe
                   ` (4 preceding siblings ...)
  2022-11-18 23:16 ` [PATCH v3 5/6] lib/pattern: Support binary pattern buffers on windows Logan Gunthorpe
@ 2022-11-18 23:16 ` Logan Gunthorpe
  2022-11-19  0:37 ` [PATCH v3 0/6] Support large pattern buffers Vincent Fu
  6 siblings, 0 replies; 8+ messages in thread
From: Logan Gunthorpe @ 2022-11-18 23:16 UTC (permalink / raw)
  To: fio, Jens Axboe, Vincent Fu; +Cc: David Sloan, Stephen Bates, Logan Gunthorpe

Add a test which writes a file with a 16KB buffer pattern, then
verify the file with the same pattern.

The test writes a single 16KB block and thus should be the same as
the patttern file. Verify that this is the case after the test is
run.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
---
 t/jobs/t0027.fio   | 14 ++++++++++++++
 t/run-fio-tests.py | 29 +++++++++++++++++++++++++++++
 2 files changed, 43 insertions(+)
 create mode 100644 t/jobs/t0027.fio

diff --git a/t/jobs/t0027.fio b/t/jobs/t0027.fio
new file mode 100644
index 000000000000..b5b97a30d3d7
--- /dev/null
+++ b/t/jobs/t0027.fio
@@ -0,0 +1,14 @@
+[global]
+filename=t0027file
+size=16k
+bs=16k
+
+[write_job]
+readwrite=write
+buffer_pattern='t0027.pattern'
+
+[read_job]
+stonewall=1
+readwrite=read
+verify=pattern
+verify_pattern='t0027.pattern'
diff --git a/t/run-fio-tests.py b/t/run-fio-tests.py
index e5b307ac0db8..a06f812683cb 100755
--- a/t/run-fio-tests.py
+++ b/t/run-fio-tests.py
@@ -799,6 +799,26 @@ class FioJobTest_t0025(FioJobTest):
         if self.json_data['jobs'][0]['read']['io_kbytes'] != 128:
             self.passed = False
 
+class FioJobTest_t0027(FioJobTest):
+    def setup(self, *args, **kws):
+        super(FioJobTest_t0027, self).setup(*args, **kws)
+        self.pattern_file = os.path.join(self.test_dir, "t0027.pattern")
+        self.output_file = os.path.join(self.test_dir, "t0027file")
+        self.pattern = os.urandom(16 << 10)
+        with open(self.pattern_file, "wb") as f:
+            f.write(self.pattern)
+
+    def check_result(self):
+        super(FioJobTest_t0027, self).check_result()
+
+        if not self.passed:
+            return
+
+        with open(self.output_file, "rb") as f:
+            data = f.read()
+
+        if data != self.pattern:
+            self.passed = False
 
 class FioJobTest_iops_rate(FioJobTest):
     """Test consists of fio test job t0009
@@ -1214,6 +1234,15 @@ TEST_LIST = [
         'pre_success':      None,
         'requirements':     [Requirements.not_windows],
     },
+    {
+        'test_id':          27,
+        'test_class':       FioJobTest_t0027,
+        'job':              't0027.fio',
+        'success':          SUCCESS_DEFAULT,
+        'pre_job':          None,
+        'pre_success':      None,
+        'requirements':     [],
+    },
     {
         'test_id':          1000,
         'test_class':       FioExeTest,
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH v3 0/6] Support large pattern buffers
  2022-11-18 23:15 [PATCH v3 0/6] Support large pattern buffers Logan Gunthorpe
                   ` (5 preceding siblings ...)
  2022-11-18 23:16 ` [PATCH v3 6/6] test: add large pattern test Logan Gunthorpe
@ 2022-11-19  0:37 ` Vincent Fu
  6 siblings, 0 replies; 8+ messages in thread
From: Vincent Fu @ 2022-11-19  0:37 UTC (permalink / raw)
  To: Logan Gunthorpe, fio, Jens Axboe; +Cc: David Sloan, Stephen Bates

On 11/18/22 18:15, Logan Gunthorpe wrote:
> Hello,
> 
> This is v3 of this patchset which includes changes for the feedback
> thus far.
> 
> When testing storage platforms that have integrated compression it is
> useful to use specifically compressible real world data. The existing
> buffer_compress_percentage option doesn't work well for testing the
> performance of the compression algorithm portion of the stack seeing it
> results in random data (which can slow down many compression algorithms)
> followed by trivially compressible data (which can often be compressed
> at a higher performance than real world data).
> 
> Instead of the crude compressibility approximation when using
> buffer_compress_percentage, it would be useful to specify specific test
> data for benchmarking purposes. Using real world data or well-known
> test data like the Calgary Corpus would help to make the benchmarks
> closer to real world applications.
> 
> The existing buffer_pattern option would almost work for this, except it
> is limited to a maximum length of 512B and will be repeated in the
> IO buffer and thus can't be used for any arbitrary compression
> size seeing repeated data is easily compressed.
> 
> This patch set proposes allowing much larger buffer patterns,
> that may be loaded from a file. The first patch provides the
> infrastructure for passing arbitrary sized buffers in the client/server
> protocol (which means the server version number must be bumped), the
> second patch adds a method for pre-calculating the buffer size
> in parse_and_fill_pattern(), the third patch supports properly
> reading large files and the final patch enables dynamic allocation of the
> buffers themselves.
> 
> The buffer_pattern will still be truncated to the block size of
> the IO and repeated for every IO, but this should still be acceptable
> for testing compression in storage which tends to operate on
> fixed size chunks anyway.
> 
> An arbitrary limit on the pattern buffer of 128MB is still enforced to
> avoid issues that would crop up with the maximum limit of the
> client/server commands (FIO_SERVER_MAX_CMD_MB). This should be more than
> enough seeing it isn't useful to specify a buffer greater than the block
> size and blocksizes that large are not very practical.
> 
> Thanks,
> 
> Logan
> 
> --
> 
> Changes since v2:
>    - Fix missing signed off by on patch
>    - Add a patch to properly support binary patterns on windows
>      per Vincent and the CI run of the new test
> 
> Changes since v1:
>    - Added comment on the patterns member of struct thread_options_pack,
>      per Vincent
>    - Added a test in the suite to test a 16KB buffer and verify pattern
> 
> --
> 
> Logan Gunthorpe (6):
>    cconv: Support pattern buffers of arbitrary size
>    lib/pattern: Support NULL output buffer in parse_and_fill_pattern()
>    lib/pattern: Support short repeated read calls when loading from file
>    options: Support arbitrarily long pattern buffers
>    lib/pattern: Support binary pattern buffers on windows
>    test: add large pattern test
> 
>   cconv.c            |  86 +++++++++++++++++++++++++++-----------
>   client.c           |  17 +++++---
>   gclient.c          |  12 ++++--
>   lib/pattern.c      | 100 +++++++++++++++++++++++++++++++++++++--------
>   lib/pattern.h      |  21 +++++++---
>   options.c          |  10 ++---
>   server.c           |  23 +++++++----
>   server.h           |   2 +-
>   stat.h             |   1 -
>   t/jobs/t0027.fio   |  14 +++++++
>   t/run-fio-tests.py |  29 +++++++++++++
>   thread_options.h   |  15 ++++---
>   12 files changed, 257 insertions(+), 73 deletions(-)
>   create mode 100644 t/jobs/t0027.fio
> 
> 
> base-commit: 07c8fe21021681f86fbfd3c3d63b88a5ebd4e557
> --
> 2.30.2

Applied. Thanks.

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2022-11-19  1:33 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-11-18 23:15 [PATCH v3 0/6] Support large pattern buffers Logan Gunthorpe
2022-11-18 23:15 ` [PATCH v3 1/6] cconv: Support pattern buffers of arbitrary size Logan Gunthorpe
2022-11-18 23:15 ` [PATCH v3 2/6] lib/pattern: Support NULL output buffer in parse_and_fill_pattern() Logan Gunthorpe
2022-11-18 23:15 ` [PATCH v3 3/6] lib/pattern: Support short repeated read calls when loading from file Logan Gunthorpe
2022-11-18 23:15 ` [PATCH v3 4/6] options: Support arbitrarily long pattern buffers Logan Gunthorpe
2022-11-18 23:16 ` [PATCH v3 5/6] lib/pattern: Support binary pattern buffers on windows Logan Gunthorpe
2022-11-18 23:16 ` [PATCH v3 6/6] test: add large pattern test Logan Gunthorpe
2022-11-19  0:37 ` [PATCH v3 0/6] Support large pattern buffers Vincent Fu

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.